F. Louis H. M. Stumpers 


F. Louis H. M. Stumpers was born in Eindhoven, 
The Netherlands, on August 30, 1911. He joined 
the Philips Research Laboratories in 1928. From 
1934 to 1937 he studied mathematics and physics 
at Utrecht University. After his doctoral exami- 
nation and research work on Stark effect, he returned 
to the Philips Laboratories as a research physicist 
in 1938. His first work was concerned with the 
application of semiconductors in telecommunication. 
In 1939 he joined the Department of Fundamental 
Radio Research under Professor van der Pol. In this 
group he did work on circuit theory, cables, fre- 
quency modulation, stochastic problems, and noise. 
In 1946 he received the Doctoral degree in technical 
science from Delft University on a thesis about 
frequency modulation. Since then he has worked on 
low-noise receivers, general noise problems and 
information theory, and radio interference. In 
1952-1953, on leave of absence from the Philips 
Laboratories, he was a research associate at the 
Research Laboratory of Electronics of the Massa- 
chusetts Institute of Technology. 

From its conception in 1948, Dr. Stumpers has 
been a member of the board of the Foundation for 
the Study of Radioastronomy in the Netherlands. 


For his part in the work that led to the registration 
of the hydrogen line in Kootwijk, he was awarded 
the Veder Radio Prize in 1951 (together with van 
der Hulst and Muller). 

Since 1950 he has been a member of the Sub- 
committee for Information Theory of the Inter- 
national Scientific Radio Union (URSI). He is 
a member of the Netherlands URSI Committee. 

Dr. Stumpers is vice-chairman of the Netherlands 
CISPR Committee, a member of International 
Subcommittee B (measurements), and a member of 
the Steering Committee of the international CISPR 
organization. (CISPR is the organization for the 
study and abatement of radio interference.) 

He attended the CCIR Conference at Warsaw as 
an expert for The Netherlands delegation. Dr. 
Stumpers is the author of about twenty-five technical 
papers. 

He is a member of the Institute of Radio Engineers 
and of the Administrative Committee of the Pro- 
fessional Group on Information Theory. He is also a 
member of the Dutch Physical Society, the Dutch 
Radio Society, and the study group for statistics of 
the Dutch language of the Dutch Society for 
Phonetics. 
Information Theory and International 
Radio Organizations 


F. LOUIS H. M. STUMPERS 


In the expanding field of information theory time 
seems to run even faster than elsewhere in this fast 
moving world. In 1950, when the first London 
Symposium was held in the quiet rooms of the Royal 
Society, many of us came under the spell of a bright 
new idea, that might once again make possible a 
new synthesis, unifying many disciplines. Those 
present had the impression one could still know what 
was going on in the field of information theory, the 
men working in it, and its future possibilities. Only 
a few years have passed and information theory has 
had enormous successes, yet some feel disappointed. 
Once again specialization is taking its toll, making 
it impossible to follow more than a part of the field. 
Also, numerous as the papers are—and the yield of 
interesting and worthwhile papers is not less than 
in any other field—practical applications directly 
attributable to information theory are not yet 
great in number. This, together with a sound 
distrust of easy popularity has made for a more 
than critical attitude in some quarters. 

Neither the disappointment nor the unusually 
critical attitude can, in my view, be justified. 
Information theory has stressed the use of some 
new tools and it has given us some new yardsticks. 
They can be applied in a great variety of situations. 
Only the specialist can assess their merit in his 
branch of science, e.g., neurophysiology or optics, 
but we all gain by the understanding of mathematical 
models and methods and by the greater chance of 
cross-fertilization. Fields already well cultivated 
may not yield an extra crop; even so, it is worth 
trying. The fact alone that we have better means 
to measure efficiency, does not of itself make the 
efficiency greater. That we had already fairly good 
qualitative means to compare systems, does not 
make a quantitative approach less worthwhile. 

In the special field of radio communication many 
subjects were thoroughly studied before and quite 
apart from the application of communication theory. 
The International Radio Consultative Committee, 
CCIR, already had study groups in which complete 
radio systems as well as questions of bandwidth 
reduction, signal-to-noise ratio, fading, tolerable 
interference, and error detecting and correcting 
codes were studied. The steady progress in these 
subjects must have been favorably influenced by 
the theoretical developments, even if a direct link 
cannot always be shown. Though in CCIR emphasis 
is laid on practice rather than theory, it also issues 
at regular intervals a bibliography on communica- 
tion theory with numerous abstracts. Communica- 
tion theory is directly represented in the form of 
CCIR Question No. 133 and Study Program No. 
86. The question concerns the investigation of tech- 
nical methods for the most efficient use of com- 
munication channels. The study program suggests 
research on the comparison of existing codes with 
theoretical limits and the study of new codes, taking 
into account phenomena peculiar to radio propa- 
gation. As I can give here only an abridged version, 
those of our members interested should look at the 
original documents. Some very good studies in the 
literature provide at least a partial solution to 
these problems. 

Another organization working on the application 
of science to radio is the International Scientific 
Radio Union. Since 1950 it has had a special com- 
mission for information theory under the chairman- 
ship of Professor van der Pol, Director of CCIR until 
1957. For the meeting in Boulder this summer the 
chairman has drawn attention to the problems of 
television-bandwidth reduction; error-detecting 
codes combined with automatic repeat request 
procedures; signal form, duration and bandwidth; 
and antennas for transhorizon communication. 

A few of the many radiocommunication questions 
in which the ideas of information theory are fertile, 
have been mentioned above. Other subjects, e.g., 
speech synthesis and visual perception are of obvious 
importance to this field. 

Theoretical and fundamental studies come first, 
and much work remains to be done, but many will 
judge us by the practical applications we can find. 
Let us give special care to this. It is fair and right 
also to look outside the radio field, but in this Pro- 
fessional Group, radio still has the claim for first 
attention. 
On the Detection of Stochastic Signals in Additive 
Normal Noise—Part I 


DAVID MIDDLETON† 


Summary—The problem of optimum and suboptimum detection 
of normal signals in additive normal noise backgrounds is examined 
by the methods of statistical decision theory. Some general results 
for optimum receiver structure, error probabilities, and average risk 
are obtained for the case of colored noise backgrounds. A detailed 
study of threshold reception in white-noise backgrounds is included, 
along with calculations of Bayes risk, bias terms, and minimum 
detectable signals for broad-band RC-noise signals and narrow- 
band, i.e., high-Q, LRC-noise signal processes. Optimum detector 
structures for signal processes with rational intensity spectra are 
also determined for the white noise case, and particular attention 
is paid to optimum receiver design in terms of physically realizable 
elements. Suboptimum receiver structure and performance are con- 
sidered briefly, as well as a number of limiting cases of more special 
interest. General methods of attack are illustrated, with details given 
in Appendixes I-V. Application of the results to a variety of com- 
munication problems is indicated. 


LIST OF SYMBOLS 


$a_0^2$ = input signal-to-noise power ratio 

[Be |p = m* iterated kernel for D(é, T) 

C = optimum weighting for discrete sampling 

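$C_\alpha$, $C_\beta$, etc. = preassigned constant costs 

$\mathbf{D} = \mathbf{k}_S \mathbf{k}_N^{-1}$ 

$D_T(\gamma)$ = a Fredholm determinant 

$F(\mathbf{V} \mid \mathbf{S})$ = conditional probability density of $\mathbf{V}$, given $\mathbf{S}$ 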

Fx (dé) ny, san, Py(é) = characteristic functions 

$\Theta(x)$ = error function: $\Theta(x) = (2/\sqrt{\pi}) \int_0^x e^{-t^2}\,dt$ 

h(x)r, hy(v) = weighting functions of linear filters 

$\mathbf{K}_N$ = noise covariance matrix; $\mathbf{k}_N$ = normalized covariance matrix 

$\mathbf{K}_S$ = signal covariance matrix; $\mathbf{k}_S$ = normalized covariance matrix 

$K(t, u)_{S,N}$ = covariance functions of signal and of noise 

$\mathcal{K}$ = a cost ratio, or threshold 

$L_n$ = $n$th semi-invariant 

log £,(V) = a suboptimum system 

N = noise vector 

N., N,, S., S; = components of the noise and signal 

$P_n(x)$, $Q_n(x)$ = probability densities for $\log \Lambda_n(\mathbf{V})$ 

$P_T(x)$, $Q_T(x)$ = error probabilities for $\log \Lambda_T$, continuous sampling 

$R(\sigma, \delta)$ = average risk 

@*,, 4 = normalized Bayes risks 

S = signal vector 

T = duration of the observation period 

V., V. = components of the received data wave 


* Manuscript received by the PGIT, March 11, 1957. The 
present paper is based directly on Res. Memo. RM-1770 (August 
6, 1956) written by the author as a consultant to The RAND 
Corp., Santa Monica, Calif. Permission to publish this material is 
gratefully acknowledged. 

† 49 Lexington Ave., Cambridge 38, Mass. 


V,(t) = output of a predetection filter 

V = data vector 

$\mathbf{v}$ = normalized data vector = $\mathbf{V}/\psi_N^{1/2}$ 

$W_{0N}$, $W_{0S}$ = spectral densities of the noise and signal processes, defined on the basis of a single-sided intensity spectrum and in terms of simple frequency (in cps) 

X, Y, Z = (column) vectors 

gid 9268) gfe) g®) gh, igh t=) arouments ror me men nam 
probabilities 


27(t), pr(t, wu) = solutions of certain integral equations 
$\alpha$, $\beta$ = conditional error probabilities 

$\alpha^*$, $\beta^*$ = Bayes (conditional) error probabilities 

$\alpha^*_M$, $\beta^*_M$ = Minimax conditional error probabilities 

Io, 14, Go = bias terms 

afi = Wos/Wow 


$\delta(\gamma \mid \mathbf{V})$ = a decision rule 

$\eta$ = a numerical fraction ($0 \le \eta \le 1$) 

$\Lambda_n(\mathbf{V})$ = generalized likelihood ratio, discrete sampling 

$\Lambda_T(V)$ = generalized likelihood functional, continuous sampling 
\ = wprT’ = normalized observation period 
MN = eigenvalues of the matrix G 
[iM |g = eigenvalues for the kernel G 
$\mu = p/q$ = ratio of a priori probabilities of signal and of noise 

o. = effective input signal-to-noise (power) ratio 

a = eT Wow 

$\Phi_n$ = structure factor for optimum system, discrete sampling 

$\Phi_T$ = structure factor for optimum system, continuous sampling 

$\Psi_n$ = structure factor for suboptimum system, discrete sampling 

$\Psi_T$ = structure factor for suboptimum system, continuous sampling 

$\psi_S$, $\psi_N$ = mean intensity of signal and of noise 

®p = parameter proportional to the spectral width of 


the signal process. 


I. INTRODUCTION 


ALTHOUGH deterministic signals (those whose general waveforms are 
known a priori at both transmitter and receiver) occur frequently in 
communication problems, signals with a wholly random character, 
while less common, are also of considerable practical interest. For 
example, random signals occur in the course of scatter-path 
transmission when originally deterministic signals are converted by 
the medium of propagation into (narrow-band) random waves whose 
envelope and phase structure depend in part on the original signal. 
In the case of frequency-shift-key (fsk) communication, for instance, 
where the transmitted wave consists of sequences of sinusoidal 
wave-trains of finite duration at one or the other of two chosen 
carrier frequencies, the original wave-trains are transformed into 
corresponding sequences of narrow-band, normal noise waves, centered 
about the original carrier frequency locations. Even if the medium of 
propagation does not transform an original signal ensemble in such 
fashion, the signals themselves may possess such a complex 
time-structure that a normal process is a reasonable approximation to 
the actual signal set. 

Other examples of random signal processes are easily found; waveforms 
of this type are to be expected in radio-spectroscopy, where now the 
source is represented by one or more spectral lines. Because of 
collision-broadening and other atomic effects, these lines have a 
finite width, and the radiation process itself by which they are 
produced may be regarded as a normal process from the macroscopic 
point of view. A similar type of phenomenon occurs in radioastronomy. 
There the signal sources are either radiostars or gas clouds whose 
signals are essentially normal noise distributed with varying 
intensities over the radio spectrum. The former produce broad-band 
noise, while the latter generate narrow-band processes. Still other 
examples of signals described by an entirely random process can be 
readily constructed, with counterparts in actual physical situations. 
From the viewpoint of communication theory, a task of central 
importance in all these examples is the detection of the presence (or 
absence) of (one or more) such random signals against an appropriate 
(normal) noise background. 

Compared to the studies of optimum and suboptimum detection of 
deterministic signals,¹ not much attention appears to have been given 
to the corresponding problem of random signals. Of previous work, the 
most significant in this regard is that of Davis [2], who has given a 
rigorous and rather general discussion of the problem but without 
obtaining results in closed form or results that are readily 
susceptible of computation and interpretation as specific receiver 
systems of realizable elements. Somewhat later, in their analysis of 
the optimum sequential detection of signals in noise, Bussgang and 
Middleton [3] considered briefly the question of Gaussian 
(broad-band) signals in similar, normal noise, but apart from the 
present discussion, to the author's knowledge it is only in the 
closely related work of Price [4] that an extensive treatment of the 
problem appears.² 

Although this investigation and Price's interesting 
and important study have many points of contact and 


1 See Bibliography [1]. In particular, see references therein 
(pg. 250): (1.9), (1.10), (1.10a), (1.11), (1.12), (1.14), (1.15)-(1.20), 
(1.22)-(1.26), (1.30)-(1.34), (1.40), (1.48)-(1.50), (3.3), (3.12)-(3.17), 


Be : 
2 See also the related work of Turin [20]. 


profit mutually from their common origin in the under- 
lying problem of detecting normal noise in normal noise 
backgrounds, their approaches and emphases differ to a 
considerable extent. Price’s work pursues the question 
of scatter-path communication to a much greater degree 
and considers more complicated conditions of operation, 
including fading, than does this study. Moreover, his 
treatment of narrow-band waves, based on the approach of 
Kac and Siegert [5], and of Emerson [6] in certain in- 
stances, is extended to include those more complicated 
situations and is considerably more thorough than the 
comparable analysis here. On the other hand, our treat- 
ment stems from the general approach of decision theory, 
as applied to detection, and in addition, includes an 
introductory account of the problem of colored noise in 
colored noise backgrounds, although no numerical results 
are presented at this time. We also consider the problem 
of broad-band noise signals, where attention is focused 
mainly on threshold reception. For these reasons, then, 
the two treatments supplement each other to a certain 
extent, with enough overlap to establish convenient 
analytical relations between the various different special 
problems examined in each study. 

The purpose of this paper, accordingly, is to examine 
optimum and suboptimum systems for the detection of 
normal noise signals in additive, normal noise back- 
grounds by the methods of statistical decision theory 
[1]. Apart from special problems involving specific broad- 
and narrow-band signals and the systems appropriate 
to them, our aim is first, to outline a general approach 
for situations of this class in the usual cases of finite 
observation periods $(0, T)$ when the background noise 
is not "white," i.e., does not possess a uniform spectrum 
at all frequencies; and second, to illustrate a new method 
of obtaining specific results in the important cases of 
threshold reception where the input signal energy is at 
most comparable to that of the interfering noise. Although 
this method is approximate, it is particularly well suited 
to the weak-signal cases and can be employed where an 
exact approach cannot be pressed further analytically. 

Apart from these rather general aims, a number of 
new and more special results are obtained: 

1) Solutions of the integral equations appropriate to 
signal ensembles with general rational spectra when the 
background noise is white; 

2) Discrete structures for the general colored-noise 
case, as well as for white noise interference; 

3) Minimax detection processes, when the a priori 
probabilities of signal and noise and of noise alone are 
unknown; 

4) Specific, realizable receiver structures, in terms of 
linear (time-invariant) matched filters and zero-memory 
nonlinear elements, or time-varying linear elements and 
multiplier units; 

5) A number of limiting conditions or operations, such 
as independent sampling (in the discrete cases) and 
infinite observation times. 

Before we proceed to the main topics, let us first indicate 
the assumptions on which the analysis itself is based: 

1) Both the noise background, $N(t)$, and the noise-signal, 
$S(t)$, belong to independent normal random 
processes, with zero means and with covariance functions 
$K_N(t_1, t_2)$, $K_S(t_1, t_2)$, respectively. In much of our later 
discussion we shall further assume that $N$ and $S$ are 
stationary processes, so that $K_N(t_1, t_2) = K_N(|t_1 - t_2|)$, 
etc., although this does not greatly reduce the generality 
of the treatment. 

2) The background noise, $N(t)$, may be "white," with 
finite spectral density $W_{0N}$, or it may be "colored," with 
some nonuniform spectrum of bounded total intensity. 

3) The noise-signal, $S(t)$, may be spectrally broad- or 
narrow-band; in any case, both signal and noise are 
additive processes. 

4) Finite observation periods $(0, T)$ are postulated, 
except in the special situation of semi-infinite observation 
times ($T \to \infty$). 

Our basic problem, of course, is the binary one of 
determining whether or not the signal, $S(t)$, is present in 
the noise background. Statistically, this is equivalent to 
testing the hypothesis $H_1$ of $S + N$ against the alternative 
$H_0$ of $N$ alone. Moreover, we shall take as our criterion of 
performance the measure of average risk or cost, $R$. For 
optimum performance, receiver structure and operation 
are determined by minimizing this average cost [1]. The 
quantities of chief interest to us, accordingly, are: 

1) The structure of the receiver, which indicates how 
the received data is to be processed for a decision as to 
the presence or absence of a signal. 

2) The “bias” term, which affects the decision threshold 
in actual operation. 

3) The error probabilities, in terms of which average, 
or minimum average risk ($R$, or $R^*$), is calculated, and 
by which, in turn, performance of the system is evaluated. 
On the basis of $R$ and $R^*$ both optimum and suboptimum 
systems may be compared, with respect to a common 
criterion.³ 

Finally, we point out that there are important practical 
problems still remaining, to which the present effort is 
preliminary: solutions for colored noise backgrounds, the 
case where rms input signal level is unknown except for a 
distribution of possible values, the details of strong- 
signal operation, and certain useful suboptimum systems, 
all remain for a later study (to appear as Part II when 
completed). 




II. GENERAL STRUCTURE OF THE OPTIMUM DETECTOR 
(DISCRETE SAMPLING) 


A. Decision Theory Formulation 


Following the earlier discussion of Middleton and Van 
Meter⁴ we find that the formulation of our present problem 
of determining the presence or absence of a random 


3 See Bibliography [1], in particular, (3.5), (3.6). 
4 See Bibliography [1], (3.1) and (3.2). 
signal, $S(t)$, in an (additive) noise background may be 
summarized as follows: 

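The optimum system is found by minimizing the average 
risk $R(\sigma, \delta)$, where 

$$R(\sigma, \delta) = \int_{\Omega} d\mathbf{S}\,\sigma(\mathbf{S}) \int_{\Gamma} d\mathbf{V}\,F(\mathbf{V} \mid \mathbf{S}) \int_{\Delta} d\boldsymbol{\gamma}\,C(\mathbf{S}, \boldsymbol{\gamma})\,\delta(\boldsymbol{\gamma} \mid \mathbf{V}). \tag{1}$$

Here, specifically, we have 

$\mathbf{V} = (V_1, \cdots, V_n)$; $V_k = V(t_k)$ = a set of $n$ sampled data values, arranged in order of increasing time, i.e., $t_i > t_j$ if $i > j$. For convenience sampling takes place at equal intervals in the observation period $(0, T)$, so that $t_k = kT/n$. 

$\mathbf{S} = (S_1, \cdots, S_n)$, the sampled signal vector; 

$F(\mathbf{V} \mid \mathbf{S})$ = the conditional probability (density) for $\mathbf{V}$, given $\mathbf{S}$. 

$\sigma(\mathbf{S})$ = the a priori probability density for $\mathbf{S}$. Here $\sigma(\mathbf{S}) = q\,\delta(\mathbf{S} - 0) + p\,w_1(\mathbf{S})$, where $p$, $q\,(= 1 - p)$, are the a priori probabilities of $S + N$ and $N$ alone, respectively; $w_1(\mathbf{S})$ is the density function for $\mathbf{S}$, given the presence of $S$ in noise $N$, where, of course, $\int w_1(\mathbf{S})\,d\mathbf{S} = 1$. 

$\mathbf{N} = (N_1, \cdots, N_n)$, the sampled background noise vector. 

$\boldsymbol{\gamma} = (\gamma_1, \cdots, \gamma_m)$, a set of $m$ decisions; here $\boldsymbol{\gamma} = (\gamma_0, \gamma_1)$, corresponding to $\gamma_0$: $N$ alone, $\gamma_1$: $S + N$. Thus, our decision problem is a binary one ($m = 2$), as mentioned above. 

$C(\mathbf{S}, \boldsymbol{\gamma})$ = a set of costs, which are assigned to each possible combination of signal input to the receiver and decision output. In particular: 

$$C(\mathbf{S} = 0;\ \gamma_0) = C_{1-\alpha}; \quad C(\mathbf{S} = 0;\ \gamma_1) = C_\alpha; \quad C(\mathbf{S} \ne 0;\ \gamma_0) = C_\beta; \quad C(\mathbf{S} \ne 0;\ \gamma_1) = C_{1-\beta} \tag{2a}$$

for the four possible combinations of input states and final decisions. $C_{1-\alpha}$, $C_{1-\beta}$ are the costs associated with correct, or "successful" decisions, while $C_\alpha$, $C_\beta$ are the costs of incorrect, or "unsuccessful" decisions. From the definition of "successful" and "unsuccessful," it is clear that $C_\alpha > C_{1-\alpha}$, $C_\beta > C_{1-\beta}$. In addition, we set $C_{1-\alpha} = C_{1-\beta} = 0$, although this is not a necessary restriction. 

$\delta(\boldsymbol{\gamma} \mid \mathbf{V})$ = a set of decision rules, by which decisions are made on the basis of the received data $\mathbf{V}$. In detection, $\delta$ is a probability, and for our problem we observe that there are but two situations $\delta(\gamma_0 \mid \mathbf{V})$, $\delta(\gamma_1 \mid \mathbf{V})$, subject to the condition that a definite decision is actually made, viz: 

$$\delta(\gamma_0 \mid \mathbf{V}) + \delta(\gamma_1 \mid \mathbf{V}) = 1. \tag{2b}$$

The average risk may be expressed in terms of the error probabilities $\alpha$, $\beta$, according to 

$$R(\sigma, \delta) = R_0 + q\alpha(C_\alpha - C_{1-\alpha}) + p\beta(C_\beta - C_{1-\beta}), \tag{3}$$

with $R_0 = qC_{1-\alpha} + pC_{1-\beta}$, and 

$$\alpha = \int_{\Gamma} F(\mathbf{V} \mid 0)\,\delta(\gamma_1 \mid \mathbf{V})\,d\mathbf{V}; \qquad \beta = \int_{\Gamma} \langle F(\mathbf{V} \mid \mathbf{S})\rangle_S\,\delta(\gamma_0 \mid \mathbf{V})\,d\mathbf{V}. \tag{4}$$

The $\alpha$ and $\beta$ are the class conditional probabilities, respectively, of deciding a signal is present, or that noise alone occurs, when in either instance the reverse is actually true. Thus $\alpha$ and $\beta$ are the probabilities of Type I and Type II errors, based on the condition of noise alone, and of signal and noise. The average $\langle\ \rangle_S$ in $\beta$ is with respect to $w_1(\mathbf{S})$, governing $\mathbf{S}$, or if $\mathbf{S}$ is deterministic, any random parameters in $\mathbf{S}$. 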

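Minimization of the average risk $R$, (3), follows from suitable choice of decision rule. It is found that⁵ the optimum detector structure is the generalized likelihood ratio 

$$\Lambda_n(\mathbf{V}) = \mu\,\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S / F(\mathbf{V} \mid 0), \tag{5}$$

and the optimum decision process itself is 

$$\text{decide } \gamma_1, \text{ i.e., } S + N, \text{ when } \Lambda_n \ge \mathcal{K}; \qquad \text{decide } \gamma_0, \text{ i.e., } N, \text{ when } \Lambda_n < \mathcal{K}; \qquad \mathcal{K} \equiv \frac{C_\alpha - C_{1-\alpha}}{C_\beta - C_{1-\beta}}. \tag{6}$$

Here the cost ratio $\mathcal{K}$ is called the threshold and depends only on the preassigned costs. (The subscript $n$ on $\Lambda_n$ reminds us that $\Lambda_n$, and $F = F_n$, also are functions of the sample-size or observation period.) Accordingly, the minimum average risk, or Bayes risk $R^* = \min_\delta R(\sigma, \delta)$, becomes [(3), (4)] 

$$R^*(\sigma, \delta^*) = R_0 + q\alpha^*(C_\alpha - C_{1-\alpha}) + p\beta^*(C_\beta - C_{1-\beta}), \tag{7}$$

where $\delta^*(\gamma_0 \mid \mathbf{V}) = 1$, if $\Lambda_n < \mathcal{K}$, (6) and (2b). The (conditional) error probabilities for this minimum average risk system are obtained from (4), from the nature of the decision rule $\delta^*$, above. Now, in application it is more convenient to deal with some monotonic function of $\Lambda_n$ as the receiver operator on the data $\mathbf{V}$. The most useful choice is the (natural) logarithm, so that henceforth here the optimum detector is, from (5), 

$$\log \Lambda_n(\mathbf{V}) = \log \mu + \log\{\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S / F(\mathbf{V} \mid 0)\}. \tag{8}$$

The error probabilities are now determined from 

$$\alpha^* = \int_{\log \mathcal{K}}^{\infty} Q_n(x)\,dx; \qquad \beta^* = \int_{-\infty}^{\log \mathcal{K}} P_n(x)\,dx, \tag{9}$$

where $Q_n(x)$, $P_n(x)$ are respectively the probability densities of $\log \Lambda_n(\mathbf{V})$ with respect to the hypotheses $H_0$ and $H_1$: 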

5 See Bibliography [1], (3.2). 
6 For details, see Bibliography [1], (3.5). 


$$Q_n(x) = \int_{\Gamma} \delta(x - \log \Lambda_n(\mathbf{V}))\,F(\mathbf{V} \mid 0)\,d\mathbf{V} \tag{10a}$$

$$P_n(x) = \int_{\Gamma} \delta(x - \log \Lambda_n(\mathbf{V}))\,\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S\,d\mathbf{V}. \tag{10b}$$


As has been shown previously⁷ for particular choices of 
the cost ratio $\mathcal{K}$ one gets either a Neyman-Pearson 
detection system, or Siegert's Ideal system (the latter for 
$\mathcal{K} = 1$, corresponding to minimization of $\alpha q + \beta p$). 
Note, incidentally, that whenever we talk about error 
probabilities, we are implying a decision process of some 
sort, and consequently, some measure of value associated 
with the possible decisions, regardless of which evaluation 
function may be chosen (here embodied in the constant, 
preassigned costs). 
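
The threshold test described in this subsection can be sketched in a few lines. The sketch assumes the threshold is the cost ratio $\mathcal{K} = (C_\alpha - C_{1-\alpha})/(C_\beta - C_{1-\beta})$ and that the receiver compares $\log \Lambda_n$ with $\log \mathcal{K}$; the function names `threshold` and `decide` and the sample values are hypothetical.

```python
import math


def threshold(c_alpha, c_beta, c_1m_alpha=0.0, c_1m_beta=0.0):
    """Cost ratio K, assumed here to be (C_alpha - C_{1-alpha}) / (C_beta - C_{1-beta})."""
    return (c_alpha - c_1m_alpha) / (c_beta - c_1m_beta)


def decide(log_lambda, k):
    """Return 1 (decide S + N) when log Lambda_n >= log K, else 0 (decide N alone)."""
    return 1 if log_lambda >= math.log(k) else 0


# Equal costs for the two kinds of error give K = 1, the Ideal-observer case.
k_ideal = threshold(c_alpha=1.0, c_beta=1.0)
decisions = [decide(x, k_ideal) for x in (-0.3, 0.0, 0.8)]
print(k_ideal, decisions)
```

Making a false alarm costlier than a miss raises $\mathcal{K}$ and so demands a larger $\log \Lambda_n$ before a signal is declared.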


B. Colored Noise-Signal in Colored Noise Background 


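For the additive, independent normal noises assumed in Section I, we have 

$$F(\mathbf{V} \mid \mathbf{S}) = W_n(\mathbf{N} = \mathbf{V} - \mathbf{S})_N = (2\pi)^{-n/2} (\det \mathbf{K}_N)^{-1/2} \exp\{-\tfrac{1}{2}(\tilde{\mathbf{V}} - \tilde{\mathbf{S}})\mathbf{K}_N^{-1}(\mathbf{V} - \mathbf{S})\} \tag{11}$$

with $F(\mathbf{V} \mid 0) = W_n(\mathbf{V})_N$ obtained directly from (11) on setting $\mathbf{S} = 0$ therein; $W_n$ is the distribution density of the noise alone. Also, we have 

$$w_1(\mathbf{S}) = (2\pi)^{-n/2} (\det \mathbf{K}_S)^{-1/2} \exp\{-\tfrac{1}{2}\tilde{\mathbf{S}}\mathbf{K}_S^{-1}\mathbf{S}\} \tag{12}$$

for our normal noise signals, where now (with $\overline{N} = \overline{S} = 0$) 

$$\mathbf{K}_N = [K_N(t_j, t_k)] = [\overline{N(t_j)N(t_k)}]; \qquad \mathbf{K}_S = [K_S(t_j, t_k)] = [\overline{S(t_j)S(t_k)}] \tag{13}$$

are the covariance matrices of the background noise and of the possible incoming signal. In both instances $\mathbf{K}_N$ and $\mathbf{K}_S$ are symmetric. By definition, it follows here that 

$$\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S = \int w_1(\mathbf{S})\,W_n(\mathbf{V} - \mathbf{S})_N\,d\mathbf{S}, \tag{14}$$

so that using (11) and (12), with the aid of [7]: 

$$\int \cdots \int \exp\{i\tilde{\mathbf{t}}\mathbf{x} - \tfrac{1}{2}\tilde{\mathbf{x}}\mathbf{A}\mathbf{x}\}\,d\mathbf{x} = (2\pi)^{n/2} (\det \mathbf{A})^{-1/2} \exp\{-\tfrac{1}{2}\tilde{\mathbf{t}}\mathbf{A}^{-1}\mathbf{t}\} \tag{15}$$

we get finally 

$$\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S = \frac{\exp\{-\tfrac{1}{2}\tilde{\mathbf{V}}\mathbf{K}_N^{-1}\mathbf{V} + \tfrac{1}{2}\tilde{\mathbf{V}}\mathbf{K}_N^{-1}(\mathbf{K}_S^{-1} + \mathbf{K}_N^{-1})^{-1}\mathbf{K}_N^{-1}\mathbf{V}\}}{(2\pi)^{n/2}\sqrt{\det[\mathbf{K}_S\mathbf{K}_N(\mathbf{K}_S^{-1} + \mathbf{K}_N^{-1})]}}. \tag{16}$$

The optimum receiver structure for detection, (8), becomes 

$$\log \Lambda_n = \log \mu - \tfrac{1}{2}\log\det(\mathbf{I} + \mathbf{K}_S\mathbf{K}_N^{-1}) + \tfrac{1}{2}\tilde{\mathbf{V}}\{\mathbf{K}_N^{-1}(\mathbf{K}_S^{-1} + \mathbf{K}_N^{-1})^{-1}\mathbf{K}_N^{-1}\}\mathbf{V}. \tag{17}$$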


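7 Bibliography [1], (3.3). 

Note, incidentally, that since $N$ and $S$ are both Gaussian and independent, we may write alternatively 

$$\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S = W_n(\mathbf{V})_{S+N} = \frac{\exp\{-\tfrac{1}{2}\tilde{\mathbf{V}}(\mathbf{K}_S + \mathbf{K}_N)^{-1}\mathbf{V}\}}{(2\pi)^{n/2}\sqrt{\det(\mathbf{K}_S + \mathbf{K}_N)}} \tag{18}$$

from which it follows that 

$$\log \Lambda_n = \log \mu - \tfrac{1}{2}\log\det(\mathbf{I} + \mathbf{K}_S\mathbf{K}_N^{-1}) + \tfrac{1}{2}\tilde{\mathbf{V}}(\mathbf{K}_N^{-1} - (\mathbf{K}_S + \mathbf{K}_N)^{-1})\mathbf{V}, \tag{19}$$

with the consequent identity 

$$\mathbf{C} \equiv \mathbf{K}_N^{-1} - (\mathbf{K}_S + \mathbf{K}_N)^{-1} = \mathbf{K}_N^{-1}(\mathbf{K}_S^{-1} + \mathbf{K}_N^{-1})^{-1}\mathbf{K}_N^{-1}. \tag{19a}$$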
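
The matrix identity (19a) is easy to check numerically. The sketch below does so for a pair of arbitrarily chosen (assumed) symmetric, positive-definite 2 x 2 covariance matrices, using small hand-written helpers so that no external library is needed; the matrices and helper names are illustrative only.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def add2(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def sub2(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]


def mul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


# Assumed example covariance matrices (symmetric and positive definite).
K_S = [[2.0, 0.6], [0.6, 1.0]]
K_N = [[1.0, 0.3], [0.3, 1.5]]

# First form of C: K_N^{-1} - (K_S + K_N)^{-1}
first_form = sub2(inv2(K_N), inv2(add2(K_S, K_N)))
# Second form of C: K_N^{-1} (K_S^{-1} + K_N^{-1})^{-1} K_N^{-1}
second_form = mul2(mul2(inv2(K_N), inv2(add2(inv2(K_S), inv2(K_N)))), inv2(K_N))

max_gap = max(abs(first_form[i][j] - second_form[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two forms agree to rounding error, which is what allows the first to be used for weak-signal expansions and the second for strong-signal ones.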
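At this point we introduce the normalizations: 

$$\mathbf{k}_S = \psi_S^{-1}\mathbf{K}_S; \qquad \mathbf{k}_N = \psi_N^{-1}\mathbf{K}_N; \qquad a_0^2 = \psi_S/\psi_N, \tag{20}$$

where $\psi_S$ and $\psi_N$ are respectively the mean intensities, $\overline{S^2}$, $\overline{N^2}$, of signal and noise, while $a_0^2$ is the input signal-to-noise power ratio. The optimum detector structure, (17), then becomes, for discrete sampling, 

$$\log \Lambda_n = \log \mu - \tfrac{1}{2}\log\det(\mathbf{I} + a_0^2\mathbf{D}) + \frac{\psi_N}{2}\,\tilde{\mathbf{v}}\mathbf{C}\mathbf{v}, \qquad \mathbf{D} \equiv \mathbf{k}_S\mathbf{k}_N^{-1}, \tag{21}$$

where $\mathbf{v}$ is the normalized data $\mathbf{V}/\psi_N^{1/2}$ and $\mathbf{C}$ is given by (19a). The terms $\Gamma_0 = \log \mu - \tfrac{1}{2}\log\det(\mathbf{I} + a_0^2\mathbf{D})$ (196) are called the bias, while 

$$\Phi_n = \psi_N\tilde{\mathbf{v}}\mathbf{C}\mathbf{v} = \tilde{\mathbf{V}}\mathbf{C}\mathbf{V}, \tag{22}$$

is termed the structure of the optimum detector. Thus, more compactly, our optimum system is 

$$\log \Lambda_n = \Gamma_0 + \tfrac{1}{2}\Phi_n = \Gamma_0(a_0^2; n) + \tfrac{1}{2}\Phi_n(a_0^2; \mathbf{v}). \tag{23}$$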
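
To make the "bias plus structure" form (23) concrete, the following sketch computes $\log \Lambda_n$ for $n = 2$ in two ways: from the bias and the quadratic form $\tilde{\mathbf{V}}\mathbf{C}\mathbf{V}$, and directly as the log-ratio of the two zero-mean normal densities with covariances $\mathbf{K}_S + \mathbf{K}_N$ and $\mathbf{K}_N$. The covariance matrices, data vector, and value of $\mu$ are assumed for illustration, and the helper names are hypothetical.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def quad(v, m):
    """Quadratic form v~ M v for a 2-vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


def log_normal_density(v, k):
    """Log density of a zero-mean normal 2-vector with covariance matrix k."""
    return -math.log(2 * math.pi) - 0.5 * math.log(det2(k)) - 0.5 * quad(v, inv2(k))


def log_likelihood_ratio(v, k_s, k_n, mu):
    """log Lambda_n as bias + (1/2) * structure, following (19) and (23)."""
    k_sum = [[k_s[i][j] + k_n[i][j] for j in range(2)] for i in range(2)]
    inv_n, inv_sum = inv2(k_n), inv2(k_sum)
    c = [[inv_n[i][j] - inv_sum[i][j] for j in range(2)] for i in range(2)]
    # det(I + K_S K_N^{-1}) equals det(K_S + K_N) / det(K_N)
    bias = math.log(mu) - 0.5 * math.log(det2(k_sum) / det2(k_n))
    structure = quad(v, c)
    return bias + 0.5 * structure


# Assumed example values.
K_S = [[2.0, 0.6], [0.6, 1.0]]
K_N = [[1.0, 0.3], [0.3, 1.5]]
V = [0.7, -1.2]
MU = 0.5

via_structure = log_likelihood_ratio(V, K_S, K_N, MU)
K_SUM = [[K_S[i][j] + K_N[i][j] for j in range(2)] for i in range(2)]
direct = math.log(MU) + log_normal_density(V, K_SUM) - log_normal_density(V, K_N)
print(via_structure, direct)
```

The agreement of the two numbers reflects the fact that the data enter only through a quadratic form, while everything that does not depend on the data is collected in the bias.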


C. Weak- and Strong-Signal Cases 


Observe first of all that since reception is necessarily incoherent here⁸ and because of the normal statistics of signal and noise, the detector structure involves at most a generalized autocorrelation of the received data $\mathbf{V}$ with itself, through $\Phi_n = \psi_N\tilde{\mathbf{v}}\mathbf{C}\mathbf{v}$. Not only for weak signals ($a_0^2 \ll 1$), but for strong ones as well ($a_0^2 \gg 1$) is this true: the optimum receiver is essentially an energy detector. Let us, however, for the moment consider the weak-signal case. For this it is convenient to use the first form of $\mathbf{C} = \mathbf{C}(a_0^2)$, (19a), which on expansion yields for the structure factor 

$$\Phi_n(a_0^2; \mathbf{v}) = \tilde{\mathbf{v}}\Big\{\sum_{l=1}^{\infty} (-1)^{l+1} a_0^{2l}\,\mathbf{k}_N^{-1}\mathbf{D}^l\Big\}\mathbf{v} = a_0^2\,\tilde{\mathbf{v}}\{\mathbf{k}_N^{-1}\mathbf{D}\}\mathbf{v} + O(a_0^4), \qquad a_0^2 \ll 1, \tag{24}$$

revealing the expected dependence on the input signal-to-noise power ratio $a_0^2$ (instead of $a_0$, for coherent systems). 


8 Bibliography [1], (3.8). 

The bias term, $\Gamma_0$, may be similarly developed. From the results of Appendix III, (197) we have 

$$\Gamma_0(a_0^2) = \log \mu + \sum_{m=1}^{\infty} \frac{(-1)^m a_0^{2m}\,\mathrm{trace}\,\mathbf{D}^m}{2m} = \log \mu - (a_0^2/2)\,\mathrm{trace}\,\mathbf{D} + O(a_0^4) \tag{25}$$

and the series is convergent, provided $a_0^2\,|\lambda_1^{(D)}| < 1$, where $\lambda_1^{(D)}$ is the largest eigenvalue⁹ of the matrix $\mathbf{D} = \mathbf{k}_S\mathbf{k}_N^{-1}$. [The condition for the convergence of (24) is that $\Phi_n(a_0^2)$ be convergent, with respect to either hypothesis $H_0$ or $H_1$, see (10a), (10b)]. 

In the strong-signal situation we may use the second form of the identity (19a) for $\mathbf{C}(a_0^2)$ to obtain the following expression for the optimum detector's structure: 

$$\Phi_n(a_0^2; \mathbf{v}) = \tilde{\mathbf{v}}\Big\{\sum_{m=0}^{\infty} a_0^{-2m}(-1)^m\,\mathbf{k}_N^{-1}(\mathbf{D}^{-1})^m\Big\}\mathbf{v} \cong \tilde{\mathbf{v}}\mathbf{k}_N^{-1}\mathbf{v}, \qquad a_0^2 \gg 1, \tag{26}$$

which shows that to a first approximation, the structure $\Phi_n$ is independent of $a_0^2$. This, as well as the dependence of $\Phi_n$ on $a_0^2$ in the weak-signal cases, is an example of the phenomenon of modulation suppression whereby a weak component in the presence of a strong is made still weaker in the course of rectification. Here the noise is suppressed¹⁰ ($\mathbf{V} \sim \mathbf{S}$) while in the threshold situation it is the signal that is suppressed.¹¹ The bias term, on the other hand, becomes to a first approximation 

$$\Gamma_0 \cong \log \mu - \tfrac{1}{2}\log\det(a_0^2\mathbf{D}), \qquad a_0^2 \gg 1. \tag{27}$$

For a development in inverse powers of $a_0^2$, (26), some procedure other than (197) for the weak-signal case must be used. 

III. ERROR PROBABILITIES AND BIAS 
(DISCRETE SAMPLING) 


A. Eigenvalue Method 


Before considering specific conditions of operation, let us find the error probabilities, and hence the minimum average risk, associated with our optimum detection process. We consider first discrete sampling. From (9), (10), and (7) we may obtain the Bayes risk, with the help of (23) for the particular problem of the present paper. We may write the following relations for the characteristic functions of $Q_n(x)$, $P_n(x)$ (the distribution densities of $\log \Lambda_n$, now considered as a random variable) when $\mathbf{V}$ is allowed ensemble properties: 

$$\int_{-\infty}^{\infty} Q_n(x)\,e^{i\xi x}\,dx = F_1(i\xi)_N; \qquad \int_{-\infty}^{\infty} P_n(x)\,e^{i\xi x}\,dx = F_1(i\xi)_{S+N} \tag{28}$$


9 In all our work it is assumed that the eigenvalues of the noise, 
signal, bias, and other pertinent matrices of the theory, are distinct. 
10 The noise, of course, still influences the structure of the receiver, 
through $\mathbf{k}_N$; similar remarks apply for the signal in the threshold 
cases. 
11 Bibliography [1], (3.8). 


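from (10), and (23), or (27), and the fact that 

$$\delta(x - \log \Lambda_n(\mathbf{V})) = \int_{-\infty}^{\infty} \exp\{-i\xi(x - \log \Lambda_n)\}\,\frac{d\xi}{2\pi}. \tag{29}$$

Thus we have 

$$F_1(i\xi)_N = \langle \exp\{i\xi(\Gamma_0 + [\psi_N/2]\tilde{\mathbf{v}}\mathbf{C}\mathbf{v})\}\rangle_{H_0}; \qquad F_1(i\xi)_{S+N} = \langle \exp\{i\xi(\Gamma_0 + [\psi_N/2]\tilde{\mathbf{v}}\mathbf{C}\mathbf{v})\}\rangle_{H_1}, \tag{30}$$

where the average with respect to the hypothesis $H_0$ is made with the weighting function $F(\mathbf{V} \mid 0)$, and for $H_1$ with the weighting function $\langle F(\mathbf{V} \mid \mathbf{S})\rangle_S$. Applying (11), $\mathbf{S} = 0$, and (16) or (18) [$\mathbf{S} \ne 0$], with the help of (15), we get directly 

$$F_1(i\xi)_N = e^{i\xi\Gamma_0}[\det(\mathbf{I} - i\xi\,\mathbf{k}_N\psi_N\mathbf{C})]^{-1/2}, \tag{31a}$$

$$F_1(i\xi)_{S+N} = e^{i\xi\Gamma_0}[\det(\mathbf{I} - i\xi\,\psi_N(a_0^2\mathbf{k}_S + \mathbf{k}_N)\mathbf{C})]^{-1/2}, \tag{31b}$$

while the corresponding probability densities are 

$$Q_n(x) = \int_{-\infty}^{\infty} \frac{e^{-i\xi(x - \Gamma_0)}}{[\det(\mathbf{I} - i\xi\,\psi_N\mathbf{k}_N\mathbf{C})]^{1/2}}\,\frac{d\xi}{2\pi} \tag{32a}$$

$$P_n(x) = \int_{-\infty}^{\infty} \frac{e^{-i\xi(x - \Gamma_0)}}{[\det(\mathbf{I} - i\xi\,\psi_N(a_0^2\mathbf{k}_S + \mathbf{k}_N)\mathbf{C})]^{1/2}}\,\frac{d\xi}{2\pi}. \tag{32b}$$

The error probabilities themselves follow from (9) after another quadrature. At this point it is convenient to write [see (19a)] 

$$\psi_N\mathbf{k}_N\mathbf{C} = \mathbf{I} - (\mathbf{I} + a_0^2\mathbf{D})^{-1} \equiv \mathbf{G}_N; \qquad \psi_N(a_0^2\mathbf{k}_S + \mathbf{k}_N)\mathbf{C} = a_0^2\mathbf{D} \equiv \mathbf{G}_{S+N}. \tag{33}$$

There are two ways of simplifying the determinantal expressions in (31) and (32). The first method is that employed originally by Kac and Siegert [5], which, in effect, diagonalizes $\mathbf{G}$ and replaces the determinant by its eigenvalue product, (146). Thus, we can write for the determinants in (32a) and (32b) above 

$$\det(\mathbf{I} - i\xi\mathbf{G}) = \prod_{j=1}^{n} (1 - i\xi\lambda_j^{(G)}). \tag{34}$$

However, even with this reduction, the expressions for $Q_n$ and $P_n$ above cannot be evaluated precisely. At best, we can use the method of steepest descents to obtain an approximation result which, fortunately, is useful in threshold cases. For details, see Appendix II. 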

For the binary detection problems considered here it is also possible to obtain $P_n(x)$ directly from $Q_n(x)$, and vice versa, with the help of the fact that [21]

$$\int \left[\frac{\langle F(V|S)\rangle_S}{F(V|0)}\right]^{m} F(V|0)\,dV = \int \left[\frac{\langle F(V|S)\rangle_S}{F(V|0)}\right]^{m-1} \langle F(V|S)\rangle_S\,dV, \tag{34a}$$

where $m = 1, 2, \ldots$. In the logarithmic case here, e.g., $x = \log \Lambda_n$. Starting with (32a), for example, and using (33), straightforward manipulation yields




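$$Q_n(x) = [\det(I + a_0^2 D)]^{1/2}\int_{-\infty}^{\infty} \frac{e^{i\xi(\Gamma_0 - x)}}{(\det[I - (i\xi - 1)a_0^2 D])^{1/2}}\,\frac{d\xi}{2\pi},$$

which with an obvious substitution becomes finally¹²

$$Q_n(x) = [\det(I + a_0^2 D)]^{1/2}\,e^{\Gamma_0 - x}\int_{-\infty}^{\infty} \frac{e^{i\xi'(\Gamma_0 - x)}}{\{\det(I - i\xi' a_0^2 D)\}^{1/2}}\,\frac{d\xi'}{2\pi} \tag{34b}$$

$$\phantom{Q_n(x)} = \mu\,e^{-x}P_n(x). \tag{34c}$$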


B. Trace Method 


The second method, which to the author's knowledge has not been exploited in problems of this type before, is the so-called trace method (see Appendix II), which makes use of the basic identity

$$\det(I + \gamma G) = \exp\Big\{\sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\,\gamma^{m}\,\mathrm{trace}\,G^{m}\Big\}, \tag{35}$$

where the series converges, provided $|\gamma\lambda_M| < 1$, with $\lambda_M$ the largest eigenvalue of $G$. This approach is distinguished by the fact that it does not require detailed knowledge of the eigenvalues of $G$, only of the various traces of $G$, $G^2$, etc. Here $\gamma = -i\xi$, and both members of (35) are to be raised to the $-1/2$ power in (32).
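The identity (35) is easy to check numerically. The following is a minimal sketch, not part of the original analysis: the matrix `G`, the value `GAMMA`, and all helper names are illustrative assumptions, chosen so that $|\gamma\lambda_M| < 1$ and the series converges.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def det_direct(g, gamma):
    """Left member of (35): det(I + gamma * G)."""
    n = len(g)
    return det([[(1.0 if i == j else 0.0) + gamma * g[i][j] for j in range(n)]
                for i in range(n)])

def det_from_traces(g, gamma, terms=80):
    """Right member of (35): exp of the truncated trace series."""
    n = len(g)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    exponent = 0.0
    for m in range(1, terms + 1):
        power = matmul(power, g)                      # G^m
        tr = sum(power[i][i] for i in range(n))       # trace G^m
        exponent += (-1) ** (m + 1) * gamma ** m * tr / m
    return cmath.exp(exponent)

# Illustrative symmetric matrix (eigenvalues below 0.7) and gamma = -i*xi with xi = 0.8.
G = [[0.5, 0.2, 0.0], [0.2, 0.4, 0.1], [0.0, 0.1, 0.3]]
GAMMA = -0.8j
```

Only traces of powers of `G` enter `det_from_traces`; no eigenvalues are computed, which is the point of the trace method.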

When $a_0^2$ is small, we may replace the determinant by the trace expansion (35) above, approximately for all $\xi$ in our expressions for the probability densities $P_n$, $Q_n$ of (32). Retaining terms in $\xi$, $\xi^2$ only in the exponents, we develop the integrand in a series in $(i\xi)$, which upon integration with the help of (188) gives us finally

$$Q_n(x) \simeq \frac{e^{-z_N^2/2}}{[2\pi(L_2)_N]^{1/2}}\,\big\{1 + O(L_3/L_2^{3/2})_N\big\}, \tag{36a}$$
$$P_n(x) \simeq \frac{e^{-z_{S+N}^2/2}}{[2\pi(L_2)_{S+N}]^{1/2}}\,\big\{1 + O(L_3/L_2^{3/2})_{S+N}\big\}. \tag{36b}$$

Here $L_2$, $L_3$ are the second and third semi-invariants [see (180) and (186a)] and

$$z_N = \frac{x - \Gamma_0 - \tfrac{1}{2}\,\mathrm{trace}\,G_N}{(\mathrm{trace}\,G_N^2/2)^{1/2}}; \qquad (L_2)_N = \tfrac{1}{2}\,\mathrm{trace}\,G_N^2;$$
$$z_{S+N} = \frac{x - \Gamma_0 - \tfrac{1}{2}\,\mathrm{trace}\,G_{S+N}}{(\mathrm{trace}\,G_{S+N}^2/2)^{1/2}}; \qquad (L_2)_{S+N} = \tfrac{1}{2}\,\mathrm{trace}\,G_{S+N}^2. \tag{37}$$

As expected, the distribution of the logarithm of the detector structure, including bias, (23), is normal, with mean $\Gamma_0 + (1/2)\,\mathrm{trace}\,G$ and variance $(1/2)\,\mathrm{trace}\,G^2$. Correction terms, showing that the series is essentially of the Edgeworth type, may be found from (189) and (189a) with (186a) and the appropriate expressions, (33), for $G$.


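12 The author is indebted to Dr. Price for calling this to his attention.

The Bayes error probabilities in the threshold case accordingly follow from (37) in (9), and are

$$\alpha^* \simeq \tfrac{1}{2}\big\{1 - \Theta(z_n^{(\alpha)}/\sqrt{2})\big\}; \qquad \beta^* \simeq \tfrac{1}{2}\big\{1 + \Theta(z_n^{(\beta)}/\sqrt{2})\big\}, \tag{38}$$

where $\Theta(u) = (2/\sqrt{\pi})\int_0^u e^{-t^2}\,dt$ is the familiar error integral, and

$$z_n^{(\alpha)} = \frac{\log \mathcal{K} - \Gamma_0 - \tfrac{1}{2}\,\mathrm{trace}\,G_N}{\{\mathrm{trace}\,G_N^2/2\}^{1/2}}; \qquad z_n^{(\beta)} = \frac{\log \mathcal{K} - \Gamma_0 - \tfrac{1}{2}\,\mathrm{trace}\,G_{S+N}}{\{\mathrm{trace}\,G_{S+N}^2/2\}^{1/2}}. \tag{39}$$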
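As an illustration of how (38) and (39) are used, the sketch below evaluates the threshold error probabilities from the two traces of each $G$ matrix. The function names and the numerical inputs are assumptions made for this example only; $\Theta$ is taken to be the standard error function `math.erf`, which matches the definition of the error integral given with (38).

```python
import math

def theta(u):
    """Error integral Theta(u) = (2/sqrt(pi)) * integral_0^u exp(-t^2) dt."""
    return math.erf(u)

def threshold_error_probabilities(log_k, gamma0, tr_gn, tr_gn2, tr_gsn, tr_gsn2):
    """Normal-approximation error probabilities of (38), with z from (39).

    log_k   : log of the decision threshold
    gamma0  : bias term
    tr_gn   : trace G_N,      tr_gn2  : trace G_N^2
    tr_gsn  : trace G_{S+N},  tr_gsn2 : trace G_{S+N}^2
    """
    z_alpha = (log_k - gamma0 - 0.5 * tr_gn) / math.sqrt(0.5 * tr_gn2)
    z_beta = (log_k - gamma0 - 0.5 * tr_gsn) / math.sqrt(0.5 * tr_gsn2)
    alpha = 0.5 * (1.0 - theta(z_alpha / math.sqrt(2.0)))  # false alarm
    beta = 0.5 * (1.0 + theta(z_beta / math.sqrt(2.0)))    # false dismissal
    return alpha, beta
```

When the threshold sits exactly at the mean of the noise-only distribution, $z_n^{(\alpha)} = 0$ and the false-alarm probability is one half, as the normal form requires.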


Correction terms are easily found from (189) on integration. Note that by an expansion of the type

$$\{\det(I - i\xi G)\}^{-1/2} = \prod_{j=1}^{n}(1 - i\xi\lambda_j)^{-1/2} = \exp\Big\{\frac{1}{2}\sum_{j=1}^{n}\sum_{m=1}^{\infty}\frac{(i\xi\lambda_j)^m}{m}\Big\}, \tag{40}$$

and using the relation $\sum_{j=1}^{n}\lambda_j^m = \mathrm{trace}\,G^m$, see (148), we can also obtain the weak-signal results above, (36)-(39). As we shall see in Section VII, there is an alternative approach when the signal process is narrow-band that enables us to use the Kac-Siegert method of resolution into eigenvalues and so obtain exact results for all $a_0^2$. For details, see Appendix I.

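There remains the bias term, $\Gamma_0$, for this situation of discrete sampling. The exact expression is given by

$$\Gamma_0 = \log\mu - (1/2)\log\det(I + a_0^2 D), \qquad D = k_S k_N^{-1}, \tag{41a}$$
$$\phantom{\Gamma_0} = \log\mu + \log\prod_{j=1}^{n}\big(1 + a_0^2[\lambda_j]_D\big)^{-1/2}, \tag{41b}$$

[(21) et seq.] and for the weak-signal cases we may use (197) once more, to write

$$\Gamma_0 = \log\mu - \frac{a_0^2}{2}\,\mathrm{trace}\,D + \frac{a_0^4}{4}\,\mathrm{trace}\,D^2 + O(a_0^6), \qquad (a_0^2 < 1). \tag{41c}$$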
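To give a feeling for the accuracy of the weak-signal form (41c) against the exact bias (41a), here is a small sketch for a 2 x 2 matrix $D$, where the determinant and traces can be written out by hand. The matrix entries, $\mu$, and $a_0^2$ used in the check are illustrative assumptions, not values from the paper.

```python
import math

def bias_exact(mu, a0_sq, d):
    """Exact bias (41a) for a 2x2 matrix D: log(mu) - 0.5*log det(I + a0^2 D)."""
    (d11, d12), (d21, d22) = d
    determinant = (1.0 + a0_sq * d11) * (1.0 + a0_sq * d22) - a0_sq ** 2 * d12 * d21
    return math.log(mu) - 0.5 * math.log(determinant)

def bias_weak_signal(mu, a0_sq, d):
    """Weak-signal bias (41c), keeping terms through a0^4."""
    (d11, d12), (d21, d22) = d
    trace_d = d11 + d22
    trace_d2 = d11 ** 2 + 2.0 * d12 * d21 + d22 ** 2   # trace of D*D
    return math.log(mu) - 0.5 * a0_sq * trace_d + 0.25 * a0_sq ** 2 * trace_d2
```

For $a_0^2 = 0.05$ and a $D$ with unit diagonal and off-diagonal entries 0.5, the two forms differ by roughly $7 \times 10^{-5}$, which is the size of the neglected $O(a_0^6)$ term.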
As will become evident when $\Gamma_0$ is used in the expressions above for $z_n^{(\alpha)}$, $z_n^{(\beta)}$, it is necessary to consider terms $O(a_0^4)$ as well. Thus, for threshold reception, discrete sampling of colored noise signals received in colored noise, we get with the help of (192a) in (39)

$$z_n^{(\alpha)} \simeq \frac{\log(\mathcal{K}/\mu) + \dfrac{1}{2}\displaystyle\sum_{q=2}^{\infty}(-1)^q\,\frac{q-1}{q}\,a_0^{2q}\,\mathrm{trace}\,D^q}{\Big\{\dfrac{1}{2}\displaystyle\sum_{q=2}^{\infty}(-1)^q\,(q-1)\,a_0^{2q}\,\mathrm{trace}\,D^q\Big\}^{1/2}}, \qquad (a_0^2 < 1), \tag{42a}$$

and from (193),

$$z_n^{(\beta)} \simeq \frac{\log(\mathcal{K}/\mu) + \displaystyle\sum_{q=2}^{\infty}\frac{(-1)^{q+1}}{2q}\,a_0^{2q}\,\mathrm{trace}\,D^q}{a_0^2\,[\mathrm{trace}\,D^2/2]^{1/2}}, \qquad (a_0^2 < 1). \tag{42b}$$

The Bayes error probabilities are then obtained when (42) is used in (38). Estimates of the useful ranges of $\alpha^*$, $\beta^*$, (38), may be established with the assistance of (189), (189a), where the various semi-invariants may be evaluated from Appendix III-A, especially (192a), (192b), and (198).


IV. OPTIMUM RECEPTION WITH COLORED NOISE
BACKGROUNDS (CONTINUOUS SAMPLING)


A. Calculation of Optimum Detector Structure and Bias 


While we shall not pursue in any detail the question of colored noise backgrounds in the present paper, leaving this for a possible later study (Part II), it may be instructive to indicate the initial steps for this more general theory when continuous, rather than discrete sampling procedures are employed. We begin with the structure term $\Phi_n = \tilde{V}CV$, (22), and by certain matrix manipulations, followed by an appropriate passage to the limit $n \to \infty$, with $T$ fixed, we obtain the continuous versions of the discrete cases outlined in Section III.

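Here let us use the second relation for $C$ in (19a). Writing

$$X = K_N^{-1}V, \tag{43}$$

we see that the structure factor $\Phi_n$ becomes

$$\Phi_n = \tilde{X}(K_S^{-1} + K_N^{-1})^{-1}X = \tilde{X}G^{-1}X, \quad \text{with } G = K_S^{-1} + K_N^{-1}. \tag{44}$$

Now we set

$$Y = G^{-1}X, \quad \text{and } \therefore\ X = GY. \tag{45}$$

Our next step is to define a new column vector $Z$ by

$$Z = k_N^{-1}Y, \tag{46}$$

from which we see with the help of (43) and (44) in $Y = G^{-1}X$ that, alternatively,

$$Z = \psi_N CV. \tag{47}$$

Thus, we can write the structure factor in several equivalent ways:

$$\Phi_n = \tilde{X}Y = \psi_N^{-1}\tilde{V}Z = \tilde{V}CV. \tag{48}$$

The matrix equations from which the integral equations are derived in the continuous case are given by (43) and (45) in conjunction with (46). We have

$$V = K_N X; \qquad K_S X = \psi_N^{-1}(K_S + K_N)Z. \tag{49}$$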
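The equivalence of the quadratic forms in this passage can be checked on a small example. The sketch below is an illustration under stated assumptions: it takes $C = K_N^{-1} - (K_S + K_N)^{-1}$ (the form implied by (33); the relation (19a) itself is not reproduced in this section), uses arbitrary 2 x 2 covariance matrices, and all helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    determinant = a * d - b * c
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]

def add2(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def sub2(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def matvec(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def structure_factor_two_ways(k_s, k_n, v):
    """Return (V~ C V, X~ Y) with X = K_N^{-1} V, Y = G^{-1} X, G = K_S^{-1} + K_N^{-1}."""
    c = sub2(inv2(k_n), inv2(add2(k_s, k_n)))   # assumed form of C
    phi_direct = dot(v, matvec(c, v))
    x = matvec(inv2(k_n), v)                    # (43)
    g = add2(inv2(k_s), inv2(k_n))              # (44)
    y = matvec(inv2(g), x)                      # (45)
    return phi_direct, dot(x, y)
```

Both numbers agree to rounding error, which is the content of the step from $\tilde{V}CV$ to $\tilde{X}G^{-1}X$.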


At this point we assume that $V(t)$ and a finite number of its derivatives are bounded and continuous functions of $t$ and that $Z(t)$ is also a bounded, continuous function such that $dZ/dt$ exists in the interval $(0-, T+)$, except for possible $\delta$-function singularities at $t = 0$, $t = T$. Then by letting the maximum interval $(t_j < t < t_j + \delta)$ approach zero as $n \to \infty$, $(\delta \to 0)$, with $\lim_{n\to\infty,\,\delta\to 0}\,(j\delta = t_j)$ at all $t$ in $(0-, T+)$, we note formally that


| fan (et Xt) ax (8) nahn 
| Jim (2a = 2h we) xy (60h) 


6-0 


' 
| 


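Then (49) goes over into the pair of integral equations

$$V(t) = \int_{0-}^{T+} K_N(t,u)\,x(u)\,du, \qquad (0- \le t \le T+), \tag{51a}$$
$$\int_{0-}^{T+} K_S(t,u)\,x(u)\,du = \psi_N^{-1}\int_{0-}^{T+}[K_S(t,u) + K_N(t,u)]\,z(u)\,du, \qquad (0- \le t \le T+). \tag{51b}$$

Solution of the first relation, (51a), may be carried out by the methods described in Appendix IV-B, (214) and (215), when the signal and noise processes possess rational spectra. Repeating this approach for the second relation, (51b), gives us $z(t)$ as well. Now from (48) we have for the structure factor under these conditions

$$\lim_{n\to\infty}\Phi_n \to \Phi_T = \psi_N^{-1}\int_{0-}^{T+} V(t)\,z(t)\,dt, \tag{52a}$$
$$\phantom{\lim_{n\to\infty}\Phi_n \to \Phi_T} = \psi_N^{-1}\int_{0-}^{T+}\!\!\int_{0-}^{T+} x(t)\,K_N(t,u)\,z(u)\,dt\,du. \tag{52b}$$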


We emphasize that (50)-(52) are a set of purely formal operations, which must be justified in particular cases. However, the kernels $K_S$, $K_N$ encountered in our present physical problems appear such as to permit definite and unique results on passage from the discrete to the continuous sampling processes, as evidenced by solution in particular instances. (See Appendix IV for white noise backgrounds.)¹³

The bias (41) now becomes for continuous sampling (from the argument presented in Appendix I-A),

$$\Gamma_0 = \log\mu - \tfrac{1}{2}\log\mathcal{D}_T(a_0^2) = \log\mu + \log\prod_{j=1}^{\infty}\big(1 + a_0^2[\lambda_j^{(T)}]_D\big)^{-1/2}, \tag{53}$$

where $\mathcal{D}_T(a_0^2)$ is the Fredholm determinant $\lim_{n\to\infty}\det(I + a_0^2 D)$, $D = k_S k_N^{-1}$. Here $[\lambda_j^{(T)}]_D$ are the eigenvalues of $D$ in the limit $(n \to \infty)$, see Appendix I-B, (147) et seq. In threshold reception, provided $a_0^2\,|[\lambda_M^{(T)}]_D| < 1$ [Appendix II, (161a)-(161c)], we may alternatively represent $\Gamma_0$ in terms of the iterated kernels,

$$[B_m^{(T)}]_D = \int_{0-}^{T+}\!\!\cdots\!\int_{0-}^{T+} D(t_1,t_2)\,D(t_2,t_3)\cdots D(t_m,t_1)\,dt_1\cdots dt_m, \tag{54}$$

with the aid of the basic identity (161b), which is here specifically, for $\gamma = a_0^2$,

$$\Gamma_0 = \log\mu + \frac{1}{2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\,a_0^{2m}\,[B_m^{(T)}]_D, \qquad a_0^2\,|[\lambda_M^{(T)}]_D| < 1. \tag{55}$$

See Appendix III-B and (198).

13 For a discussion of this point, see Bibliography [1], (8.8).
We can now indicate the complete structure of the optimum detector when continuous sampling techniques are employed. Instead of the function $\log\Lambda_n(V)$ of $V$, we have the functional of $V(t)$,

$$\lim_{n\to\infty}\log\Lambda_n \to \log\Lambda_T(V(t)) = \log\mu - \tfrac{1}{2}\log\mathcal{D}_T(a_0^2) + \tfrac{1}{2}\Phi_T(V(t); a_0^2), \qquad \text{all } a_0^2 > 0, \tag{56}$$

from (53) and (52). In the weak-signal situation we can use (55) alternatively.

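In a similar way, instead of $Q_n(x)$ and $P_n(x)$ as the distribution densities of $\log\Lambda_n$, (10a), (10b) and (32a), (32b), we have for the corresponding distribution densities of the functional $\log\Lambda_T$, the relations

$$Q_T(x) = \int_{-\infty}^{\infty}\frac{e^{i\xi(\Gamma_0 - x)}}{[\mathcal{D}_T(-i\xi)_N]^{1/2}}\,\frac{d\xi}{2\pi} = \int_{-\infty}^{\infty} e^{i\xi(\Gamma_0 - x)}\prod_{j=1}^{\infty}\big(1 - i\xi[\lambda_j^{(T)}]_N\big)^{-1/2}\,\frac{d\xi}{2\pi}, \tag{57a}$$
$$P_T(x) = \int_{-\infty}^{\infty}\frac{e^{i\xi(\Gamma_0 - x)}}{[\mathcal{D}_T(-i\xi)_{S+N}]^{1/2}}\,\frac{d\xi}{2\pi} = \int_{-\infty}^{\infty} e^{i\xi(\Gamma_0 - x)}\prod_{j=1}^{\infty}\big(1 - i\xi[\lambda_j^{(T)}]_{S+N}\big)^{-1/2}\,\frac{d\xi}{2\pi}. \tag{57b}$$

The first expressions in (57a), (57b) are the analogs of (47a), (47b) for discrete sampling. Here $[\lambda_j^{(T)}]_N$ and $[\lambda_j^{(T)}]_{S+N}$ are the eigenvalues of the integral equations

$$\int_{0-}^{T+} G(t,\tau)_{N,\,S+N}\,f_j(\tau)\,d\tau = [\lambda_j^{(T)}]_{N,\,S+N}\,f_j(t), \qquad (0- \le t \le T+), \tag{58}$$

where $G_N$, $G_{S+N}$, (33), become the kernels $G(t,\tau)_{N,\,S+N}$ here; see Appendix I-B and (152a) et seq.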


B. Error Probabilities; Threshold Reception 


The Bayes error probabilities $\alpha^*$, $\beta^*$ follow at once (as a pair of quadratures), if we insert (57a), (57b) into (9), in place of $P_n$, $Q_n$. However, the same difficulties are encountered here as in the discrete situation, since it is not possible to evaluate $Q_T(x)$, $P_T(x)$ in general. The threshold cases can be treated, on the other hand, with the aid of the basic identity (161b), viz.

$$\mathcal{D}_T(-i\xi) = \exp\Big\{-\sum_{m=1}^{\infty}\frac{(i\xi)^m}{m}\,[B_m^{(T)}]_G\Big\}, \qquad |\xi|\,|[\lambda_M^{(T)}]_G| < 1, \tag{59}$$

where $[B_m^{(T)}]_G$ is the $m$th iterated kernel of $G_N$ or $G_{S+N}$, (33), as these matrices go over into $G(t,\tau)_{N,\,S+N}$ for continuous sampling; $[\lambda_M^{(T)}]_G$ is the largest eigenvalue of (58), (149a). Replacing the Fredholm determinant by (59) and retaining terms $O(\xi, \xi^2)$ only in the exponent, we may evaluate (57a), (57b) approximately, to obtain the analogs of (36) in the discrete case, viz:

$$Q_T(x) \simeq \exp\{-z_N^2/2\}\,\{2\pi(L_2)_N\}^{-1/2}; \qquad P_T(x) \simeq \exp\{-z_{S+N}^2/2\}\,\{2\pi(L_2)_{S+N}\}^{-1/2}, \tag{60}$$

where now, in place of (37) for $z_N$, $z_{S+N}$, we have specifically

$$z_N = \big(x - \Gamma_0 - \tfrac{1}{2}[B_1^{(T)}]_N\big)\big/(L_2)_N^{1/2}; \qquad z_{S+N} = \big(x - \Gamma_0 - \tfrac{1}{2}[B_1^{(T)}]_{S+N}\big)\big/(L_2)_{S+N}^{1/2}, \tag{61a}$$

with

$$(L_2)_N = \tfrac{1}{2}[B_2^{(T)}]_N; \qquad (L_2)_{S+N} = \tfrac{1}{2}[B_2^{(T)}]_{S+N}. \tag{61b}$$

Here $[B_1^{(T)}]_N$ is the first iterated kernel for $G(t,\tau)_N$, etc. Correction terms to the normal distribution densities may be found as before with the aid of (188), (189), Appendix II-C. A similar development of the eigenvalue forms in (57a), (57b) using (149a) also results in (60) and (61), as the reader can readily verify for himself.

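In the threshold situation, we get approximately for the minimum error probabilities

$$\alpha^* \simeq \tfrac{1}{2}\big\{1 - \Theta(z_T^{(\alpha)}/\sqrt{2})\big\}; \qquad \beta^* \simeq \tfrac{1}{2}\big\{1 + \Theta(z_T^{(\beta)}/\sqrt{2})\big\}, \tag{62}$$

and with the aid of (55) and (59) we find that now

$$z_T^{(\alpha)} = \Big[\log(\mathcal{K}/\mu) - \tfrac{1}{2}[B_1^{(T)}]_N - \frac{1}{2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\,a_0^{2m}[B_m^{(T)}]_D\Big]\Big/\big(\tfrac{1}{2}[B_2^{(T)}]_N\big)^{1/2}, \tag{63a}$$
$$z_T^{(\beta)} = \Big[\log(\mathcal{K}/\mu) - \tfrac{1}{2}[B_1^{(T)}]_{S+N} - \frac{1}{2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\,a_0^{2m}[B_m^{(T)}]_D\Big]\Big/\big(\tfrac{1}{2}[B_2^{(T)}]_{S+N}\big)^{1/2}, \tag{63b}$$

provided $(a_0^2 < 1)$; [the strict condition for convergence is given in (55)]. The Bayes risk then follows from (7).

This completes our present exposition of the general case of colored noise signals in colored noise backgrounds. We shall apply this in the succeeding sections to the important class of problems where the noise accompanying the signal is white. Explicit determination of the iterated kernels $[B_m^{(T)}]_G$ and the eigenvalues $[\lambda_j^{(T)}]_D$, etc., are reserved for a later study.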


V. OPTIMUM THRESHOLD RECEPTION IN STATIONARY
WHITE NOISE (CONTINUOUS SAMPLING)


A. Detector Structure 


Generally, the most common situation in communications practice occurs when the noise background is supplied for the most part by the shot and thermal noise of the receiver itself, and so is essentially white, from the spectral point of view, and normal statistically. Stationarity is also a valid assumption in most instances as long as the observation period $(0, T)$ is not excessively long. In examining the important cases of threshold reception, using the trace method where continuous sampling procedures are employed, we shall begin with detector structure.

Here it is convenient to let $\psi_N^{-1}z(t) = z_T(t)$, $(0 < t < T)$, in (51b), and observe for stationary white noise backgrounds that

$$K_N(t,u) = \frac{W_{0N}}{2}\,\delta(t - u), \tag{64}$$

where $W_{0N}$ is the spectral density of this noise process, so that the solution of the first integral equation (51a) is at once $x(t) = (2/W_{0N})V(t)$, $(0 < t < T)$. The second integral equation (51b) becomes accordingly

$$\int_{0-}^{T+}\Big[K_S(t,u) + \frac{W_{0N}}{2}\,\delta(u - t)\Big]z_T(u)\,du = \frac{2}{W_{0N}}\int_{0-}^{T+}K_S(t,u)\,V(u)\,du, \qquad (0 < t < T), \tag{65}$$

whose solution, when the signal process is stationary and has rational spectra, is given in general terms by (231b) with (237)-(239). The resulting $z_T(t)$, $(0 < t < T)$, however, is a linear functional of the received data $V(t)$, and although systems can be built to compute the structure term $\Phi_T$, (52a), viz.,

$$\Phi_T = \int_0^T V(t)\,z_T(t)\,dt, \tag{66}$$
it is more convenient from the point of view of system design, as we shall see presently, to obtain an alternative form of solution that is independent of the received wave. This is easily done, if we return for the moment to the matrix forms (43)-(49) characteristic of discrete sampling. We now regard $\psi_N$ as the mean intensity $W_{0N}B$ of band-limited white noise, letting $B \to \infty$ at a suitable stage of the analysis, so as to give us once again the white noise background of our present problem. Now for this band-limited white noise we observe that its covariance function is

$$K_N(|t_1 - t_2|) = \psi_N\,\frac{\sin 2\pi B(t_1 - t_2)}{2\pi B(t_1 - t_2)}, \tag{67}$$

so that if data is sampled at the times $t_j = j/2B$ $(j = 1, \ldots, n)$ at which $K_N(|t_i - t_j|)$ is zero $(i \neq j)$, we have $K_N = \psi_N I$. Consequently, (43) becomes

$$X = \psi_N^{-1}V. \tag{68}$$
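The statement that sampling at $t_j = j/2B$ diagonalizes the band-limited noise covariance (67) can be verified directly. The sketch below is illustrative only; the function names and the values of $\psi_N$, $B$, and $n$ used in the check are assumptions.

```python
import math

def bandlimited_covariance(tau, psi_n, bandwidth):
    """Covariance (67): psi_N * sin(2*pi*B*tau) / (2*pi*B*tau)."""
    x = 2.0 * math.pi * bandwidth * tau
    if x == 0.0:
        return psi_n          # limiting value at tau = 0
    return psi_n * math.sin(x) / x

def sampled_covariance_matrix(n, psi_n, bandwidth):
    """Covariance matrix of n samples taken at t_j = j / (2B), j = 1..n."""
    times = [j / (2.0 * bandwidth) for j in range(1, n + 1)]
    return [[bandlimited_covariance(ti - tj, psi_n, bandwidth) for tj in times]
            for ti in times]
```

The off-diagonal entries fall at the zeros of the $\sin x / x$ function, so the matrix reduces to $\psi_N I$ up to rounding error.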


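Letting $(\psi_N C)_{ij} = \rho_T(t_i, t_j)\,\Delta t$, [cf. (259)], and using (47) in (49) we get finally

$$K_S = (K_N + K_S)\,\psi_N C, \quad\text{or} \tag{69a}$$
$$(K_S)_{ik} = \sum_{j=1}^{n}\Big[\frac{W_{0N}}{2}\,\frac{\delta_{ij}}{\Delta t} + (K_S)_{ij}\Big]\rho_T(t_j, t_k)\,\Delta t, \tag{69b}$$

where we have made use of the fact that $2BT = n$, as $B \to \infty$, $n \to \infty$, and $\psi_N = W_{0N}B$, so that $\psi_N = W_{0N}/2\Delta t$ with $\Delta t = 1/2B$. Passing to the limit then yields the desired integral equation in $\rho_T$, viz.,

$$\int_{0-}^{T+}\Big[K_S(t,u) + \frac{W_{0N}}{2}\,\delta(u - t)\Big]\rho_T(u,\tau)\,du = K_S(t,\tau), \qquad (0- \le t, \tau \le T+), \tag{70}$$

which is one form of a somewhat more general relation obtained by Price.¹⁴ (For further details, see Appendix IV-E.) Since $Z = \psi_N CV$, (47), the continuous version is [vide (262)],

$$Z(t) = \int_{0-}^{T+} V(t')\,\rho_T(t', t)\,dt', \tag{71}$$

where $\rho_T(t, t') = \rho_T(t', t)$, (261). From (263) and (268) we find that the solutions of the two integral equations (65) and (70) are related through (71), where now¹⁵

$$Z(t) = \frac{W_{0N}}{2}\,z_T(t), \qquad (0- \le t \le T+), \tag{72}$$

and that the structure factor $\Phi_T$ can be written, alternatively to (66), as

$$\Phi_T = \frac{2}{W_{0N}}\int_{0-}^{T+}\!\!\int_{0-}^{T+} V(t_1)\,\rho_T(t_1, t_2)\,V(t_2)\,dt_1\,dt_2. \tag{73}$$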


Here $\rho_T$ is indeed independent of the data $V(t)$. Solutions of (70) may be obtained by the methods of Appendix IV-A; as noted therein for the particular kernels involved here, we can find the solution at once by inspection of (231b), with the aid of (71) and (72). The results for general, rational signal spectra are given by (269). In the special cases of RC and LRC-noise signals (Appendix III-D and III-E) explicit solutions follow at once from Appendix IV-C and D as applied to (269), et seq.

14 Bibliography [4], Part I, (1.54).

15 The relationship (72) may also be established if we observe that $dZ'/dt = z_T(t)$, where $Z' = CV = Z\psi_N^{-1}$, upon suitable passage to the limit.

16 Bibliography [4], (29).

While $\Phi_T$ can always be computed directly for a particular $V(t)$, it is possible to put it in a form that is easier for the system designer to work with. There are many ways of setting up $\Phi_T$, (66), (73), for this purpose, two of which we shall describe below briefly. We note first that since the integrand of (73) is symmetric, vide (261), we may make use of the following relation, as employed originally by Price in a similar connection,¹⁶

$$\int_0^T\!\!\int_0^T A(x,y)\,dx\,dy = 2\int_0^T dx\int_0^x A(x,y)\,dy, \qquad A(x,y) = A(y,x), \tag{74}$$

to write

$$\Phi_T = \frac{4}{W_{0N}}\int_0^T V(t)\,dt\int_0^t V(\tau)\,\rho_T(\tau, t)\,d\tau. \tag{75}$$

Since we may set $\rho_T(t,\tau) = \rho_T(t - \tau, t)$, etc. equal to zero for all $t - \tau < 0$ without affecting $\Phi_T$, we can interpret $\rho_T(t - \tau, t)$ as the weighting function (i.e., impulse response for a unit impulse applied at time $t$) of a physically realizable, time-varying, linear filter. Thus,

$$x(t) = \int_0^t V(\tau)\,\rho_T(t - \tau, t)\,d\tau$$


is the output of this filter at time $t$, and $\Phi_T$ can be resolved into a sequence of realizable linear and nonlinear operations, as indicated in Fig. 1. The data $V(t)$ is passed into the time-varying filter, and then its output for all $t$ is multiplied by the original input data $V(t)$. The product is next passed through an ideal filter [weighting $1/W_{0N}$ in $(0, T)$] to give us a number, $(\tfrac{1}{2})\Phi_T$, at the end of the interval $(0, T)$. Combined with the bias $\Gamma_0$, the result is $\log\Lambda_T$, which is next compared with the preset threshold $\log\mathcal{K}$, (6), and a decision is made: "signal, as well as noise," if $\log\Lambda_T > \log\mathcal{K}$, or "noise alone," if $\log\Lambda_T < \log\mathcal{K}$. Observe here that multiplication is a nonlinear operation.

Fig. 1. Schematic diagram of one possible optimum receiver operation for the detection of a (normal) noise-signal in white, normal noise. (Blocks, in order: input $V(t)$; time-varying linear filter; multiplier (nonlinear device); ideal integrator; adder with bias $\Gamma_0$; threshold comparator, $\log\mathcal{K}$; decision.)
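The step from the symmetric double integral (73) to the causal filter-and-multiply form (75) has a simple discrete analog, sketched below. This is an illustration, not the receiver of Fig. 1 itself: the function names are hypothetical, the weights `rho` are arbitrary symmetric numbers, and in the discrete sum the diagonal terms must be kept explicitly (they carry no weight in the continuous limit).

```python
def quadratic_form(v, rho):
    """Full double sum: sum_i sum_j v_i * rho_ij * v_j."""
    n = len(v)
    return sum(v[i] * rho[i][j] * v[j] for i in range(n) for j in range(n))

def causal_receiver_form(v, rho):
    """Same quantity computed 'causally', in the spirit of (74)-(75).

    At each time index i the filter output uses only strictly earlier
    samples; it is multiplied by the current sample and accumulated.
    """
    total = 0.0
    for i, vi in enumerate(v):
        filter_output = sum(rho[i][j] * v[j] for j in range(i))  # past data only
        total += 2.0 * vi * filter_output + rho[i][i] * vi * vi   # diagonal term
    return total
```

Because `rho` is symmetric, the two functions return the same number; the second form is the one a running filter, multiplier, and integrator would produce.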

This representation of $\Phi_T$ in terms of a time-varying filter, multiplier, and integrator, however, is not unique. If we impose again the condition that our linear filter be made up of physically realizable networks, and add now the further constraint that the nonlinear operation (analogous to multiplication above) be performed by a zero-memory nonlinear element (here, a full-wave quadratic rectifier), we find (Appendix V) that $\Phi_T$ can be computed in terms of a "matched," linear, time-invariant filter,¹⁷ followed by this zero-memory square-law detection, and the same ideal integrator, see Fig. 2.

Fig. 2. Alternative system for the optimum detection of normal noise in white noise backgrounds. (Blocks, in order: optimum "matched" filter; zero-memory full-wave quadratic detector; ideal integrator.)

17 This is a generalization of the matched predetection filter introduced by Van Vleck and Middleton [8] in their study of the reception of pulsed radar.

We define a matched filter here as one that reduces any subsequent nonlinear operation for $\Phi_T$ to a zero-memory operation (Appendix V), with minimum average cost of decision. Thus $\Phi_T$ becomes, alternative to (66), (73), and (75)


(76) 
where V(t) is the output in the interval (0, 7). 


5 . 
Ve) = [ Vent — ed’, OSt<ST), (7 
70 

of the matched filter, when the input is switched on at $t = 0$ and off at $t = T$. Here $h_{\mathrm{opt}}(t - t')_R$ is the weighting function of this physically realizable, optimum predetection filter, whose structure is determined from the solution of the nonlinear integral equation (279), namely,

$$\rho_T(t,\tau)_{R\text{-opt}} = \int_0^T h_{\mathrm{opt}}(x - t)_R\,h_{\mathrm{opt}}(x - \tau)_R\,dx, \qquad (0 \le t, \tau \le T). \tag{78}$$

This solution follows directly upon a double Fourier transform of both sides of (78), if we note that $\rho_T(t,\tau)_{R\text{-opt}}$ vanishes outside the square $(0 \le t, \tau \le T)$, (70), as a consequence of the condition that only operation on data in $(0, T)$ can here influence the simple binary decision process, at the end of this interval. The modulus of the system function of this optimum, matched filter is given generally by (cf. Appendix V)

$$|Y_{\mathrm{opt}}(i\omega)_R| = \Big|\int_0^T\!\!\int_0^T \rho_T(t,\tau)_{R\text{-opt}}\,e^{-i\omega(t - \tau)}\,dt\,d\tau\Big|^{1/2} \tag{79a}$$
$$\phantom{|Y_{\mathrm{opt}}(i\omega)_R|} = \Big|\int_0^T\!\!\int_0^T \rho_T(t,\tau)_{R\text{-opt}}\,\cos\omega(t - \tau)\,dt\,d\tau\Big|^{1/2}. \tag{79b}$$

An infinite number of matched filters is seen to be available, with weighting and system functions subject to (79). This is the result of the fact that the observation process is incoherent, so that a priori phase information about the input is lacking. We cannot, therefore, expect to specify our predetector filter any more precisely, as the analysis shows. We have, in effect, an additional degree of freedom from the design viewpoint, since we are now at liberty to select those phase responses, $\phi_{\mathrm{opt}}(i\omega)_R$, in $Y_{\mathrm{opt}}(i\omega)_R = |Y_{\mathrm{opt}}(i\omega)_R|\,\exp[i\phi_{\mathrm{opt}}(i\omega)_R]$, which are most convenient to the particular system at hand.
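The role of the matched filter can be illustrated in discrete form: when the weights factor as in (78), the double sum over the data collapses to the integrated square of a single filter output. The sketch below assumes a causal impulse response `h` (an arbitrary illustrative choice, not a solution of (279)) and builds the corresponding weights from it; all names are hypothetical.

```python
def rho_from_filter(h, n):
    """Discrete analog of (78): rho_ij = sum_x h[x - i] * h[x - j], x = 0..n-1."""
    def h_at(k):
        return h[k] if 0 <= k < len(h) else 0.0   # causal, finite memory
    return [[sum(h_at(x - i) * h_at(x - j) for x in range(n)) for j in range(n)]
            for i in range(n)]

def quadratic_form(v, rho):
    n = len(v)
    return sum(v[i] * rho[i][j] * v[j] for i in range(n) for j in range(n))

def square_law_integrator(v, h):
    """Filter the data with h, square the output (zero-memory), and sum."""
    n = len(v)
    total = 0.0
    for x in range(n):
        y = sum(v[i] * h[x - i] for i in range(x + 1) if x - i < len(h))
        total += y * y
    return total
```

The two computations agree for any data vector, which is why a time-invariant filter followed by a square-law device and an integrator can replace the time-varying filter and multiplier. Any other `h` giving the same `rho` would serve equally well, in line with the freedom of phase noted above.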

We remark, finally, that $\Phi_T$ can be expressed in still other, more complicated ways. These representations are not unique, although our two examples above appear to be the simplest choices available. In any case, the decision as to what structure should be used and what elements chosen is left to the system designer's discretion, and will depend on how thoroughly and how cheaply he wishes to approach an ideal design. We shall refer to this point again, briefly, in Section VIII, following, when we consider the question of suboptimum systems.

B. The Bias $\Gamma_0$

For continuous sampling in $(0, T)$, with a white noise background and threshold observation, the bias (199) becomes¹⁸

$$\Gamma_0 \simeq \log\mu + \frac{1}{2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\Big(\frac{2\psi_S}{W_{0N}}\Big)^m[B_m^{(T)}]_S, \tag{80}$$

where $[B_m^{(T)}]_S$ is the iterated kernel, order $m$, for $k_S(t,u)$, (195), and $\sigma_0^2 = \psi_S/W_{0N}\omega_F$, where $\omega_F$ is an angular frequency measure of the spectral width of the signal process, and $\lambda\,(= \omega_F T)$ is a measure of the duration of the observation period, in terms of the bandwidth of the signal. Thus, $\sigma_0^2$ is an effective input signal-to-noise power ratio.

Fig. 3. (a) Bias ($\Gamma_0$) as a function of observation time and input signal energy for broad-band (RC) and narrow-band (LRC) noise signals in white background noise, continuous sampling. (b) Bias ($\Gamma_0$) as a function of observation time and input signal energy for broad-band (RC) and narrow-band (LRC) noise signals in white background noise, continuous sampling. (Abscissa: $\lambda = \omega_F T$.)


18 An exact expression for $\Gamma_0$, all $\sigma_0^2$, is given by
$$\log\mu + \log\prod_{j=1}^{\infty}\Big\{1 + \frac{2\psi_S}{W_{0N}}[\lambda_j^{(T)}]_S\Big\}^{-1/2},$$
on comparing (199) with (161). Here $2\psi_S/W_{0N} = 2\sigma_0^2\omega_F$, and the $[\lambda_j^{(T)}]_S$ are the eigenvalues of (152a), when $G(t,\tau)$ therein is replaced by $k_S(|t - \tau|)$. For solutions in the case of RC-noise signals, see Table I and Price's explicit calculations of eigenvalues; Bibliography [4], Part II; also Report 30-40, p. 15.

Here $W_{0N}\omega_F\ (= \psi_N)$ is proportional to the average noise power passed by the predetector, matched filter. Here (vide Appendix III and Fig. 3) the results are expressed in terms of the total input signal-to-noise energy $\sigma_0^2\lambda = \psi_S T/W_{0N}$ accepted in $(0, T)$.

Fig. 3, p. 96, shows the bias (actually $\log\mu - \Gamma_0$) for a wide range of values of input signal-to-noise ratio and integration times for the RC-noise signal (i.e., a broad-band signal), and the case of high-Q (i.e., narrow-band) LRC-noise signals. (Note that $(\omega_F)_{RC} \neq (\omega_F)_{LRC}$, (208c) et seq.) The calculations are based on approximations of the iterated kernels $[B_m^{(T)}]_S$, as described in Appendix III-D and III-E, with the resulting bias given in (211a), and (211b), for a number of extreme conditions.

Approximate as these results are, they are remarkably close to the results obtained by using the exact development in terms of eigenvalues.¹⁹ Comparison with Price's results¹⁸ indicates that for strong signals [the most unfavorable case, since the series (80) is then poorly convergent, if at all] our approximations are about [...] per cent too large ($\sigma_0^2 = 20$ db), while for the longer integration times and weaker signals, they err by 0.1 per cent or less in many instances. We conclude, then, that certainly in the threshold situations our approximate treatment, using the trace method, gives satisfactory results if not too high a degree of accuracy is required.

C. Bayes Risk and Minimum Detectable Signal

The Bayes risk is given by (7), a convenient form of which is the normalized expression

$$R_N^* \equiv \frac{R^* - \mathcal{R}_0}{p(C_\beta - C_{1-\beta})} = \frac{\mathcal{K}}{\mu}\,\alpha^* + \beta^*; \qquad \mathcal{R}_0 = qC_{1-\alpha} + pC_{1-\beta}. \tag{81}$$

Another normalization is [9]

$$\mathcal{R}_N^* \equiv \frac{R^* - R^*(\infty)}{R^*(0) - R^*(\infty)} = \begin{cases}(\mathcal{K}/\mu)\,\alpha^* + \beta^*, & \mathcal{K}/\mu < 1,\\[2pt] \alpha^* + (\mu/\mathcal{K})\,\beta^*, & \mathcal{K}/\mu > 1,\end{cases} \tag{82}$$

which has the formal advantage of the indicated symmetry with respect to $\mathcal{K}/\mu$. Here $R^*(\infty)$ is the Bayes risk for infinite input signal-to-noise ratios, while $\mathcal{R}_0$ is the irreducible average risk, and $R^*(0)$ is $qC_{1-\alpha} + pC_\beta$ if $\mathcal{K}/\mu < 1$, and is $qC_\alpha + pC_{1-\beta}$ when $\mathcal{K}/\mu > 1$.

By specializing the results of Section IV to white noise backgrounds, and in particular with the help of (62)-(63) and the various expressions for the iterated kernels in Appendix III, we see again for threshold reception that

$$\alpha^* \simeq \tfrac{1}{2}\big\{1 - \Theta(z^{(\alpha)}/\sqrt{2})\big\}; \qquad \beta^* \simeq \tfrac{1}{2}\big\{1 + \Theta(z^{(\beta)}/\sqrt{2})\big\}, \tag{83}$$

where now specifically for the RC- and high-Q LRC-noise signals of Appendix III-D and III-E, we have

$$z_{RC}^{(\alpha)} \simeq \frac{\log(\mathcal{K}/\mu) + (\lambda/2)\big[\sqrt{1 + 4\sigma_0^2} - 1\big] - \lambda\sigma_0^2(1 + 4\sigma_0^2)^{-1/2}}{\sigma_0^2\sqrt{2\lambda}\,/(1 + 4\sigma_0^2)^{3/4}}, \tag{84a}$$
$$z_{RC}^{(\beta)} \simeq \frac{\log(\mathcal{K}/\mu) + (\lambda/2)\big[\sqrt{1 + 4\sigma_0^2} - 1\big] - \sigma_0^2\lambda}{\sigma_0^2\sqrt{2\lambda}}, \qquad \lambda = \lambda_{RC} = (\omega_F)_{RC}T. \tag{84b}$$

For the usual situation of weak signals $(\sigma_0^2 \ll 1)$ and long integration times $(\lambda \gg 1)$ these reduce readily to the simpler expressions

$$z_{RC}^{(\alpha)} \simeq \frac{\log(\mathcal{K}/\mu)}{(\sigma_0^2)_{RC}\sqrt{2\lambda_{RC}}} + (\sigma_0^2)_{RC}\sqrt{\frac{\lambda_{RC}}{2}}; \qquad z_{RC}^{(\beta)} \simeq \frac{\log(\mathcal{K}/\mu)}{(\sigma_0^2)_{RC}\sqrt{2\lambda_{RC}}} - (\sigma_0^2)_{RC}\sqrt{\frac{\lambda_{RC}}{2}}. \tag{85}$$

When these relations are inserted into (83), we have the continuous analogs of earlier results for a simple optimum, pulsed radar.²⁰ As expected for threshold reception, the Bayes risk (82) is a simple function of $\sigma_0^2\sqrt{\lambda}$ under these conditions.

For the narrow-band, LRC-noise signal, Appendix III-E, we get similar results:²¹

$$z_{LRC}^{(\alpha)} \simeq \frac{\log(\mathcal{K}/\mu) + (\lambda/4)\big[\sqrt{1 + 4\sigma_0^2} - 1\big] - (\lambda/2)\sigma_0^2(1 + 4\sigma_0^2)^{-1/2}}{\sigma_0^2\sqrt{\lambda}\,/(1 + 4\sigma_0^2)^{3/4}} \simeq \frac{\log(\mathcal{K}/\mu)}{\sigma_0^2\sqrt{\lambda}} + \frac{\sigma_0^2\sqrt{\lambda}}{2}, \tag{86a}$$
$$z_{LRC}^{(\beta)} \simeq \frac{\log(\mathcal{K}/\mu) + (\lambda/4)\big[\sqrt{1 + 4\sigma_0^2} - 1\big] - (\lambda/2)\sigma_0^2}{\sigma_0^2\sqrt{\lambda}} \simeq \frac{\log(\mathcal{K}/\mu)}{\sigma_0^2\sqrt{\lambda}} - \frac{\sigma_0^2\sqrt{\lambda}}{2}, \tag{86b}$$

where now $\sigma_0^2 = (\sigma_0^2)_{LRC}$ and $\lambda = \lambda_{LRC} = (\omega_F)_{LRC}T$. The correction terms to the error probabilities (189), (189a), indicate that $\lambda$ should be $O(10^2)$, or more, for reasonable accuracy.

19 In certain special cases, for example, when the kernel of the integral equation has the form $e^{-\alpha|t|}$, it is possible to obtain closed forms for the bias, without having to calculate eigenvalues. See in particular, Price, Bibliography [4], PGIT paper, footnote 26. The method is due to Siegert [19]. (This is, however, not possible for the error probabilities.)

20 Bibliography [1], Example 1, (3.9).

21 We replace $\lambda_{RC}$ in (84a), (84b), and (85) by $\lambda_{LRC}/2$, when $\sigma_0^2$ is used, or by $2\lambda_{LRC}$ when $\sigma_0$ appears; (213), also (210), (211).

Fig. 4. Normalized Bayes risk $\mathcal{R}_N^*$ as a function of $\sigma_0^2\sqrt{\lambda}$ for broad-band (RC) and narrow-band (LRC) noise signals in a white noise background: threshold reception and continuous sampling. (Legend: $\mu = p/q$; $\lambda = \omega_F T$; $\sigma_0^2 = \psi_S/W_{0N}\omega_F$.)

Curves of normalized Bayes risk are shown in Fig. 4. They exhibit the characteristic falling-off to zero as $\lambda$, or $\sigma_0^2$, becomes indefinitely large. Depending on the value of $\log\mathcal{K}/\mu$, $(\gtrless 0)$, these curves will lie above or below that for $(\mathcal{K}/\mu = 1)$.
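The RC-signal expressions (84a), (84b) and their weak-signal limits (85) can be compared numerically. The following sketch is illustrative only: the function names and the sample values of $\sigma_0^2$ and $\lambda$ are assumptions, and $\Theta$ is again taken as `math.erf`.

```python
import math

def z_rc_exact(log_k_over_mu, sigma2, lam):
    """z^(alpha), z^(beta) for the RC noise signal, following (84a), (84b)."""
    root = math.sqrt(1.0 + 4.0 * sigma2)
    bias_term = 0.5 * lam * (root - 1.0)                 # equals log(mu) - Gamma_0
    z_alpha = ((log_k_over_mu + bias_term - lam * sigma2 / root)
               / (sigma2 * math.sqrt(2.0 * lam) / root ** 1.5))
    z_beta = ((log_k_over_mu + bias_term - lam * sigma2)
              / (sigma2 * math.sqrt(2.0 * lam)))
    return z_alpha, z_beta

def z_rc_weak(log_k_over_mu, sigma2, lam):
    """Weak-signal, long-integration forms (85)."""
    first = log_k_over_mu / (sigma2 * math.sqrt(2.0 * lam))
    second = sigma2 * math.sqrt(lam / 2.0)
    return first + second, first - second

def error_probabilities(z_alpha, z_beta):
    """Threshold error probabilities (83)."""
    alpha = 0.5 * (1.0 - math.erf(z_alpha / math.sqrt(2.0)))
    beta = 0.5 * (1.0 + math.erf(z_beta / math.sqrt(2.0)))
    return alpha, beta
```

For example, with $\mathcal{K}/\mu = 1$, $\sigma_0^2 = 0.01$, and $\lambda = 10^4$, the exact and weak-signal values of $z$ differ by less than 0.02, and both error probabilities come out a little below one quarter.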

Besides the Bayes risk and the error probabilities themselves, a quantity of related interest is the minimum detectable signal, defined as that value of input signal (power) which will yield a certain (Bayes) average risk at the output of the decision system when the background noise level is specified and a threshold $\mathcal{K}$ is chosen. Here we shall define $(a_0^2)_{\min}$ [or for continuous sampling, $(\sigma_0^2)_{\min}$] as that value of the input which, on average performance, yields some specified portion $(\eta)$ of the maximum, normalized Bayes risk $\mathcal{R}_N^*$. The fractions chosen here are $\eta = 0.1$, 0.01, and 0.001. Thus, from the calculations of $R^*$, we obtain directly

$$(\sigma_0^2)_{\min\text{-}\eta} = M_\eta/\sqrt{\lambda}, \tag{87}$$

where $M_\eta$ is a number determined from the value of $(\sigma_0^2)_{\min\text{-}\eta}\sqrt{\lambda}$ in $R^*$ that yields $\eta R^*(0) = \eta R^*_{\max}$ (the maximum average risk occurs, of course, for $\sigma_0 = 0$). The significant feature of this result is that $(\sigma_0^2)_{\min}$ is inversely proportional to the square root of the observation time, for threshold reception, as the curves of Fig. 4 show. [For very strong signals, on the other hand, as the results of Section VI indicate, $(\sigma_0^2)_{\min\text{-}\eta} \sim$ [...], characteristic of all such cases, whether coherent or not, when the background noise is effectively suppressed.] It should be emphasized that there is nothing particularly advantageous in defining minimum detectable signal in terms of this particular, normalized Bayes risk (82) apart from a certain general analytical convenience. Other normalizations, or none at all, may serve as well, with $(\sigma_0^2)_{\min}$ defined similarly. The procedure here, like that of system design, is left open for the particular application at hand.²²


D. Minimax Detection 


So far we have assumed that the a priori probabilities $(p, q)$ that a data sample $V(t)$ does, or does not, contain a signal are known. In many cases, however, $(p, q)$ are not known to the observer, and it is then a natural question to ask how one would operate an optimum receiver when this information is unavailable. One way of protecting ourselves against ignorance is to seek the "worst-best" operation, or more technically, a Minimax system.²³ Although Minimax operation can be over-conservative, it does guard against the worst possibilities, on the average. Let us, accordingly, find the Minimax system for the present problem of optimum threshold detection.

We make use now of the fact that a Minimax system is one for which the maximum conditional risk²⁴ is never greater than the maximum conditional risks of other systems, for all possible signals. We use this fact²⁵ and (83) for these weak-signal cases to write finally as our Minimax relation the Minimax probabilities

$$\mathcal{K} - 1 + \frac{2(C_{1-\alpha} - C_{1-\beta})}{C_\beta - C_{1-\beta}} = \Theta(z_M^{(\beta)}/\sqrt{2}) + \mathcal{K}\,\Theta(z_M^{(\alpha)}/\sqrt{2}), \tag{88}$$

which determines $p_M^*$ and $q_M^*$. When this has solutions, $\mu_M^*\ (= p_M^*/q_M^*)$ is determined, with $p_M^* = 1 - q_M^*$, of course.

As an example, let us take $C_{1-\alpha} = C_{1-\beta} = 0$, and let us set $\mathcal{K} = 1$.²⁶ Then, $C_\alpha = C_\beta\ (> 0)$, and from (88) we get directly, for both the RC and LRC cases above,

$$\log\mu_M^* = 0. \tag{89}$$

The Minimax system is still our Bayes system with $p_M^* = q_M^* = 1/2$ now. Thus, when $p$ and $q$ are unknown to the observer, and he sets them equal, so that $\log\mu = 0$, he is operating a Minimax detection system. He is guarding against the worst case, i.e., the most expensive errors, as is the case with Price's system [4] under these conditions. Of course, when $p = 1 - q$ is available to the observer, then he should make full use of this information, as done above. The Minimax average risk $R_M^*$ always exceeds (or at best, is equal to) the Bayes risk $R^*$, and so by comparing $\mathcal{R}_N^*$, $(\mathcal{R}_N^*)_{\text{min-max}}$ here, we can form some idea of the price we must pay for our ignorance of these a priori probabilities.

22 Bibliography [1], (3.6).

23 For a detailed discussion, see Bibliography [1], (2.10) and for detection, in particular, Bibliography [1], (3.4).

24 Bibliography [1], (8.4).

25 Bibliography [1], (3.9), and (8.20).

26 The ideal observer condition, Bibliography [1], (3.3).

The Minimax average risk in our present condition of 
ntinuous sampling and threshold reception is determined 
once from (89) [K = 1, here] in (83) et seg. We have 


af = 4{1 — O(2/vV/2)} = Bk, (90) 


Bete = tno = (o,)eo-N/ Neg /2, OF-& = fine = (62) tee 
/\iexc/2 for our RC- or (high-Q) LRC-noise signals, under
the special cost assignments introduced above. The
Minimax average risk is


m/Cs = ak = p* 
4{1 — O(a/V2)}, (91) 


A typical Minimax average risk curve is given in Fig. 4
for K = 1, and μ = 1. Minimum detectable signals in the
Minimax situation are determined exactly as in the Bayes
cases above. We have n(Ri/Cg) max = 1/2, where (0 << 1)


Ey Thy ets IA 


the fractional level at which (7) (i)_, is to be calculated,
so that here

Geen 26) (=n) \/ Name, or | (92) 

= 024/207 (1.5) V Nine, (93) 


respectively for the broad-band RC-noise signal and for
the narrow-band LRC-noise signal. Similar calculations
for other cost assignments may be effected in the same
way, with the aid of (88).


VI. A SPECIAL CASE: DISCRETE, INDEPENDENT SAMPLING


A problem of some practical interest arises when the
sampling process is made discrete, with intervals between
sampling times that are much longer than the average
fluctuation time of the signal and background noise
processes. Then both $K_S$ and $K_N$ are essentially diagonal
matrices, e.g., $K_S = \psi_S I$, $K_N = \psi_N I$, and we have a case
of random sampling. From the analytical point of view
this is interesting, since it is a situation where
comparatively simple exact expressions for the bias and error
probabilities can be found, and since it represents one
limiting form of data processing, as opposed to the other
extreme of continuous sampling. Thus, by comparing
Bayes risks, or minimum detectable signals, for a common
threshold in the two cases, we can gain some idea of what
we lose in system sensitivity when the information
inherent in the correlated sample [i.e., the continuous
Optimum structure, bias, and average performance
are now easily determined. From (19a) we see that


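$$C = \psi_N^{-1} I - (\psi_S + \psi_N)^{-1} I = \psi_N^{-1}\left(\frac{a_0^2}{1 + a_0^2}\right) I, \qquad a_0^2 \equiv \psi_S/\psi_N.$$ (94)

The structure factor $\Phi_n$, (22), is now

$$\Phi_n = \frac{1}{2}\,\tilde{v}v\left(\frac{a_0^2}{1 + a_0^2}\right),$$ (95)

while the bias, $\Gamma_0$, of (23) reduces to

$$\Gamma_0 = \log \mu - \frac{n}{2}\log(1 + a_0^2).$$ (96)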


The error probabilities follow directly from (32a), (32b),
and (36). Thus, for the optimum detector embodied in
(23), we have now

$$\log \Lambda_n = \log \mu - \frac{n}{2}\log(1 + a_0^2) + \frac{1}{2}\left(\frac{a_0^2}{1 + a_0^2}\right)\tilde{v}v;$$ (97)

we find first that the distribution densities of $\log \Lambda_n$
become [(32a), (32b), and (94), etc.], with $B_0 = a_0^2/(a_0^2 + 1)$,


=e (a — io) 


my eC dé 
Qa(z) = fs (Cee ae 


o Bs’’T(n/2) (ae — Paes 
se PT )/Bo (nS P,) 


== (0), (eo =<) (98) 
and 
=n. WL 2) Celene te 
pe TEs UR Creal) 
= 0, (G@r< Io) | 99) 


These are recognized as χ² densities, with n degrees of
freedom [10]. Applying (98) and (99) to (9), we see that
the error probabilities α*, β* may be expressed in terms of
the incomplete Gamma function


1.4, y) = r@” [ IE Le 


0 


data in (0, T)] is thrown away. Rely) > 0, Ca es ie) (100)
TABLE I
Limiting values of α*, β*, and R* versus K/μ, for the cases $a_0^2 = 0$ (n ≥ 1), $a_0^2 \to \infty$ (n ≥ 1), and $a_0^2 > 0$ (n → ∞).
We have specifically


a* =] — ye Gel) i 2| (10 1a) 
By 
B= Talga lo Cicer et. (101b) 
and the Bayes risk ®*,, (81), is simply 
Rly =“ a* + B* (102) 


where the incomplete Gamma functions in α*, β* are tabulated [11]. Values
of α*, β*, and R* are shown in Table I, p. 99.
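As a rough numerical companion to (97) and (102), the following Python sketch evaluates the two error probabilities for independent samples directly from the χ² distributions of $\tilde{v}v$. It assumes normalized samples, so that $\tilde{v}v$ is χ² with n degrees of freedom for noise alone and $(1 + a_0^2)$ times such a variable when the signal is present. The function names, the power series used for the regularized incomplete Gamma function, and the choice K = μ = 1 are illustrative assumptions; this is not the tabulated function of [11].

```python
import math

def reg_lower_gamma(s, x, terms=500):
    """Regularized lower incomplete Gamma function P(s, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for k in range(1, terms):
        term *= x / (s + k)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def error_probabilities(a0_sq, n, K=1.0, mu=1.0):
    """Return (alpha, beta) for the quadratic detector with independent samples."""
    bias = math.log(mu) - 0.5 * n * math.log1p(a0_sq)   # bias, as in (96)
    b0 = a0_sq / (1.0 + a0_sq)                          # B_0
    # "Signal" is decided when log Lambda_n >= log K, i.e. when v~v >= t.
    t = max(0.0, 2.0 * (math.log(K) - bias) / b0)
    alpha = 1.0 - reg_lower_gamma(0.5 * n, 0.5 * t)               # noise alone
    beta = reg_lower_gamma(0.5 * n, 0.5 * t / (1.0 + a0_sq))      # signal present
    return alpha, beta

# Assumed example values: a0^2 = 1 with n = 2 and n = 20 samples.
alpha_2, beta_2 = error_probabilities(1.0, 2)
alpha_20, beta_20 = error_probabilities(1.0, 20)
```

For n = 2 the χ² distribution is exponential, which gives a simple hand check of the routine; with equal costs and p = q = 1/2 the normalized risk is then half the sum of the two error probabilities.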
Specifically, the arguments of the incomplete Gamma functions in (101) are respectively


loz (K/T)) _ nh (! + y p 
B, = e log (1 + ao) 
+ (#44) log =| (108a) 
Qo Le 
as? log (x/T,) = 2 RMF oo 
A 
ieee (103b) 


In the threshold cases, we may employ the techniques of
Appendix II-C, to get the characteristic normal
distribution densities for $P_n$, $Q_n$ (with correction terms in the
Edgeworth series). The error probabilities (101) are now
approximately


OS at OG) se 10. ee On, 

Mlle ai S10 large n, (104) 

where 

a 2 

py NA) ours 1) i a» } 

Zo wD 2 ( a log (1 =e Qo) 1 
a , 

+ log x / avn =f 4% vin ,  (105a) 
ul (ao + 1) A Vn as 
Fig. 5. Normalized Bayes risk $R^*$ as a function of $a_0^2\sqrt{n}$; threshold
reception and discrete random sampling of colored noise
signals in band-limited or colored noise.

Fig. 6. Normalized Bayes risk $R^*$ as a function of input signal-
to-noise ratio $a_0^2$. Discrete random sampling of colored noise
signals in band-limited or colored noise.
vide (85) and (86). Curves of Bayes risk and minimum
detectable signals are shown in Figs. 5-7, based on (101)
and (104), (105). Again observe the characteristic
dependence of $(a_0^2)_{\min}$ on $n^{-1/2}$ for threshold detection. For
large a2, we get essentially (@2)nin-» — 2, aS expected, | 
while for intermediate values of the input signal-to-noise 
ratio the minimum detectable signal varies as 7°, 
(2. eer 


VII. ALTERNATIVE APPROACH FOR NARROW-BAND 
SIGNALS 


A. Preliminary Remarks 


When the signal process is narrow-band we can obtain
an alternative representation of optimum (and
suboptimum) detection in terms of the slowly-varying
components $S_c$, $S_s$ of the possible signal S, where now

$$S(t) = S_c(t)\cos\omega_0 t + S_s(t)\sin\omega_0 t, \qquad \omega_0 = 2\pi f_0,$$ (106)
Fig. 7. Minimum detectable signals for the normalized Bayes
risk $R^*$. Discrete random sampling of colored noise signals
in band-limited or colored noise.
and $f_0$ is some characteristic frequency, usually the
central one in the signal spectrum, if it is symmetrical.
For this discussion we shall assume that S(t) does indeed
possess a symmetrical intensity spectrum. The
background noise process N(t) can also be represented by

$$N(t) = N_c(t)\cos\omega_0 t + N_s(t)\sin\omega_0 t,$$ (107)

where now N(t) need not be narrow-band, although, of
course, $N_c$ and $N_s$ are then no longer slowly-varying
compared to $\sin\omega_0 t$, $\cos\omega_0 t$.

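Let us examine the statistics of $N_c$ and $N_s$, when N
is normal. These components are also normal,
($\bar{N}_c = \bar{N}_s = \bar{N} = 0$), with covariance functions

$$\overline{N_c(t_1)N_c(t_2)} = \overline{N_s(t_1)N_s(t_2)} = \psi_N\rho_0(|t_1 - t_2|),$$ (108a)

$$\overline{N_c(t_1)N_s(t_2)} = -\overline{N_s(t_1)N_c(t_2)} = \psi_N\lambda_0(t_1 - t_2); \qquad \lambda_0(-t) = -\lambda_0(t),$$ (108b)

where, in particular, ($t = t_1 - t_2$)

$$\rho_0(t) = \int_0^\infty W_N(f)\cos(\omega - \omega_0)t\,df \Big/ \int_0^\infty W_N(f)\,df,$$ (109a)

$$\lambda_0(t) = \int_0^\infty W_N(f)\sin(\omega - \omega_0)t\,df \Big/ \int_0^\infty W_N(f)\,df,$$ (109b)

and $W_N(f)$ is the intensity spectrum of N(t). Now in the
case of narrow-band noise we get the well-known result
that, approximately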


$$\rho_0(t) \doteq \psi_N^{-1}\int W_N(f_0 + f')\cos\omega' t\,df'; \qquad \lambda_0(t) \doteq 0,$$ (110)


provided that $W_N(f)$ is symmetrical about $f_0$, $f_0 \gg \Delta f$,
the bandwidth of N(t). Then $N_c$ and $N_s$ are essentially
statistically independent, for all $t = (t_1 - t_2)$, with a
great simplification in the subsequent analysis.

For our present problem of white normal noise
backgrounds, on the other hand, N(t) is, of course, not
narrow-band, and moreover we do not expect that $N_c(t)$, $N_s(t)$
will be statistically independent for arbitrary noise
bandwidths. However, in the limiting situation of white
noise, as we shall show below, $N_c$ and $N_s$, though rapidly
varying, are independent, and we can make use of this
fact to obtain (from the analytical point of view) a simpler
system for detecting narrow-band signals than the general
one described in Section V.

We start with band-limited white noise, for which
$W_N(f) = W_{0N}$, $0 < f < B$, $\psi_N = W_{0N}B$. Inserting this
into (109a), (109b) we get

$$\rho_0(t) = \frac{\sin\omega_0 t + \sin 2\pi(B - f_0)t}{2\pi Bt}, \qquad \lambda_0(t) = \frac{\cos\omega_0 t - \cos 2\pi(B - f_0)t}{2\pi Bt}.$$ (111)


Now observe that for $t \neq 0$, $\lim B \to \infty$, both $\rho_0(t)$, $\lambda_0(t)$
vanish, while if t = 0, we get

$$\lim_{B\to\infty}\rho_0(0) = 1, \qquad \lim_{B\to\infty}\lambda_0(0) = 0.$$ (112)

From this, and the normal character of $N_c$ and $N_s$, it
follows that $N_c$ and $N_s$ are completely independent,
(108a), (108b). Thus, we can use (107) to represent
white noise (in the limit $B \to \infty$), and the "components"
$N_c$ and $N_s$ are statistically unrelated for all $f_0$, t.
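The limiting behavior stated in (112) can be checked numerically. The following Python sketch evaluates the closed forms for $\rho_0$ and $\lambda_0$ that result from inserting a flat spectrum on (0, B) into (109a), (109b), and compares $\rho_0$ with a direct midpoint-rule integration of (109a). The function names and the particular values of $f_0$, t, and B are illustrative assumptions.

```python
import math

def rho0(t, B, f0):
    """rho_0(t) from (109a) for W_N(f) = W_0N on 0 < f < B."""
    if t == 0.0:
        return 1.0
    num = math.sin(2 * math.pi * f0 * t) + math.sin(2 * math.pi * (B - f0) * t)
    return num / (2 * math.pi * B * t)

def lambda0(t, B, f0):
    """lambda_0(t) from (109b) for the same flat spectrum."""
    if t == 0.0:
        return 0.0
    num = math.cos(2 * math.pi * f0 * t) - math.cos(2 * math.pi * (B - f0) * t)
    return num / (2 * math.pi * B * t)

def rho0_numeric(t, B, f0, steps=50000):
    """Midpoint-rule evaluation of the ratio of integrals in (109a)."""
    df = B / steps
    total = sum(math.cos(2 * math.pi * ((k + 0.5) * df - f0) * t)
                for k in range(steps))
    return total * df / B

# Assumed values: f0 = 1 kc, lag t = 1 ms, increasing noise bandwidth B.
decay = [(B, rho0(1e-3, B, 1e3), lambda0(1e-3, B, 1e3))
         for B in (1e4, 1e5, 1e6)]
```

Both functions are bounded in magnitude by $1/(\pi B|t|)$ for $t \neq 0$, which is the sense in which they vanish as the bandwidth grows.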

For $\lambda_0 \neq 0$, the distribution density of the n sampled
values of $N_c$ and $N_s$ is


exp) aka Keekay 


WeN N,)y aed (2m)"(det Ky) ) 


X) = *) Xo =10,0 NS Nees) 
N, 
and 
joes vs] @0 | 
—Ho 00 
where 0 = [po(l 4 — & |) = 0 (114) 


wo = [No(t; — ti) | = =o. 


Writing $k_N = K_N\psi_N^{-1}$, we see that $\det K_N = \psi_N^{2n}\det(\rho_0^2 + \lambda_0^2)$.
However, for $\lambda_0 = 0$ we get


$$W_{2n}(N_c, N_s)_N = W_n(N_c)_N\,W_n(N_s)_N = \frac{\exp\{-\tfrac{1}{2}\tilde{N}_c(\psi_N\rho_0)^{-1}N_c\}}{(2\pi)^{n/2}\sqrt{\det\psi_N\rho_0}}\cdot\frac{\exp\{-\tfrac{1}{2}\tilde{N}_s(\psi_N\rho_0)^{-1}N_s\}}{(2\pi)^{n/2}\sqrt{\det\psi_N\rho_0}}, \qquad (\psi_N < \infty),$$ (115)

where these relations are appropriate to white noise in
the limit $B \to \infty$. We can, nevertheless, use (113) as the
distribution density of a set of noise components $N_c$, $N_s$,
which become identical with the components of our actual
white noise background in the limit. In fact, if B is at all
large compared to the reciprocal of the sampling time
$\delta$ $(= t_{j+1} - t_j)$, (115) is an excellent approximation. We
observe, finally, that since $N_c$, $N_s$ (115) are independent
(all t), and since it is assumed that signal and noise are
also independent, we may write for the received wave


$$V(t) = V_c(t)\cos\omega_0 t + V_s(t)\sin\omega_0 t$$ (116)

with the properties that for $V_c = N_c + S_c$, $V_s = N_s + S_s$
(if there is a signal),


Vacs Vere = (We (117) 


provided $\overline{S_cS_s} = 0$, i.e., provided the spectrum of the
signal is symmetrical. In this fashion $V_c$, $V_s$, for an
additive signal, as well as noise, are statistically unrelated.


B. Structure of the Detector 


The analysis of Section II-B is now applied to $V_c$, $V_s$,
instead of V alone. The optimum detector $\log\Lambda_n(V)$ is
replaced by


Loe ACV 50 Va) 


= log K fe log 1 Wen Viss V.) s2n/WolV.; V.)v} (118) 
where it is somewhat more direct to use the alternate
approach implied by (18). For noise alone $(W_{2n})_N$ is just
(115), with $N_c$, $N_s$ replaced by $V_c$, $V_s$, while

$$W_{2n}(V_c, V_s)_{S+N} = \frac{\exp\{-\tfrac{1}{2}\tilde{V}_c(\psi_N\rho_0 + K_{S0})^{-1}V_c - \tfrac{1}{2}\tilde{V}_s(\psi_N\rho_0 + K_{S0})^{-1}V_s\}}{(2\pi)^n\det(\psi_N\rho_0 + K_{S0})},$$ (119)

since signal and noise are additive and independent.
Here $K_{S0} = [\psi_S\rho_0(t_j - t_k)_S]$ is the slowly-varying part
of the signal process' covariance matrix, (109) with
$W_N(f)$ replaced by $W_S(f)$. Writing

$$K_{N0} = \psi_N\rho_0$$ (120)

for the background noise, and applying (119) to (118),
we get the alternative form of the optimum receiver
structure in terms of the received data "components,"
$V_c$ and $V_s$, and the slowly-varying part of the signal's
covariance function. The result is

$$\log\Lambda_n(V_c, V_s) = \log\mu - \log\det(I + K_{S0}K_{N0}^{-1}) + \tfrac{1}{2}\tilde{V}_cC_0V_c + \tfrac{1}{2}\tilde{V}_sC_0V_s,$$ (121)

where $C_0 = K_{N0}^{-1} - (K_{S0} + K_{N0})^{-1}$, see (19a).

This expression is valid for discrete sampling where
$\delta \gg B^{-1}$ and becomes precise in the limit $B \to \infty$ (T
finite) of continuous sampling. From Section V we find
that under these circumstances (121) reduces to

$$\log\Lambda_T(V_c(t), V_s(t)) = \Gamma_0' + \tfrac{1}{2}\Phi_T(V_c(t)) + \tfrac{1}{2}\Phi_T(V_s(t)),$$ (122)

where $\Gamma_0'$ is given by (201b) and $\Phi_T$ may be found from (66),
(73), (75), or (76), if we replace V(t) therein respectively
by $V_c(t)$ and $V_s(t)$. The actual operation of the detector
is similar to that indicated in Figs. 1 or 2, except that a
new bias, $\Gamma_0'$, is used, and the $\Phi_T$ for both components
are added together at A', before this bias is inserted.
The same threshold, K, is used, following A (Fig. 1).²⁷
The direct, or "broad-band" approach, described in
Sections II-V, and the narrow-band or "componential"
approach considered here, give the same performance
when the thresholds are identical, as long as the input
signal is a narrow-band process (observed in white noise).
The simple structure of (122) breaks down when
wide-band signal processes are included.

27 Note that $\Gamma_0'$, computed from (201b) for the high-Q, LRC
noise signal (for which the slowly-varying covariance $K_{S0}(t)$ is
a decaying exponential, formally identical with
the RC case), is equivalently given by (192), (211a), and (211b),
when (210) is employed.

This system (122) operates on the "components" $V_c$,
$V_s$ of the received wave V. However, V is certainly
broad-band, since the accompanying noise is white, and
the question arises as to how we obtain these components
$V_c$, $V_s$ analytically and physically so that they may be
applied to our decision-making device. From the purely
mathematical viewpoint, we can answer this question
using the Hilbert transform of V to write (approximately)

$$V_c(t) = V(t)\cos\omega_0 t + \mathcal{H}(V)\sin\omega_0 t; \qquad V_s(t) = V(t)\sin\omega_0 t - \mathcal{H}(V)\cos\omega_0 t,$$ (123)

where

$$\mathcal{H}(V) = \frac{1}{\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty}\frac{V(\tau)}{t - \tau}\,d\tau; \qquad \mathcal{H}(\mathcal{H}(V)) = -V,$$ (124)

and $\mathcal{P}$ denotes the principal part of the integral.
Therefore, given V we can in principle calculate its Hilbert
transform according to (124), and from (123) thus obtain
the desired components $V_c$ and $V_s$.

In practice, of course, this is almost always out of the
question, but if the noise background is spectrally very
broad compared to the signal, i.e., $B \gg \Delta f_S$, yet narrow
compared to the central frequency $f_0$ (> B), $V_c$ and $V_s$
are slowly-varying compared to $\cos\omega_0 t$, $\sin\omega_0 t$, and may
be obtained from the received wave V(t) by modulating
it respectively by $\cos\omega_0 t$ and $\sin\omega_0 t$ and putting the
result through a filter (of gain 2), whose upper cutoff
frequency is less than $2f_0 - B$. This is a simple way of
getting the desired components for use in (122), and at
the same time is a reasonable practical approximation to
the ideal situation and a very good approximation if B
is much greater than signal bandwidth (as well as small
compared to $f_0$).
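The modulate-and-filter procedure just described can be sketched in a few lines of Python. Here the low-pass filter of gain 2 is replaced, as a simplifying assumption, by multiplication by 2 followed by an average over whole carrier periods; the sampling rate, carrier frequency, and function name are hypothetical and chosen only for illustration.

```python
import math

def quadrature_components(v, f0, fs, periods=1):
    """Estimate V_c, V_s from samples v of V(t) = V_c cos(w0 t) + V_s sin(w0 t).

    Each block of `periods` carrier cycles is multiplied by 2cos and 2sin
    and averaged, which removes the terms at twice the carrier frequency.
    """
    n_avg = int(round(periods * fs / f0))
    vc, vs = [], []
    for start in range(0, len(v) - n_avg + 1, n_avg):
        c = s = 0.0
        for k in range(start, start + n_avg):
            phase = 2.0 * math.pi * f0 * k / fs
            c += 2.0 * v[k] * math.cos(phase)
            s += 2.0 * v[k] * math.sin(phase)
        vc.append(c / n_avg)
        vs.append(s / n_avg)
    return vc, vs

# Assumed test wave: constant components V_c = 0.7, V_s = -0.3.
fs, f0 = 8000.0, 1000.0
wave = [0.7 * math.cos(2.0 * math.pi * f0 * k / fs)
        - 0.3 * math.sin(2.0 * math.pi * f0 * k / fs) for k in range(64)]
vc_est, vs_est = quadrature_components(wave, f0, fs)
```

With slowly-varying components the same averaging tracks them block by block, at the cost of a time resolution no finer than the averaging window.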


C. Error Probabilities


Following the steps of Section III-A, we may write now

$$\alpha^* = \int_{\log K}^{\infty} Q_n(y)\,dy; \qquad \beta^* = \int_{-\infty}^{\log K} P_n(y)\,dy,$$ (125)

where

$$Q_n(y) = \int\!\!\int dV_c\,dV_s\,F(V_c, V_s \mid 0)\,\delta(y - \log\Lambda_n[V_c, V_s]),$$ (126a)

$$P_n(y) = \int\!\!\int dV_c\,dV_s\,\langle F(V_c, V_s \mid S_c, S_s)\rangle_S\,\delta(y - \log\Lambda_n[V_c, V_s]).$$ (126b)

Using (121) with (119) and (120) and the fact that $F_1(i\xi)_N$,
$F_1(i\xi)_{S+N}$ follow from expressions like (28), (30), we get
finally

$$F_1(i\xi)_N = e^{i\xi\Gamma_0'}\{\det(I - i\xi\psi_N\rho_0C_0)\}^{-1},$$ (127a)

$$F_1(i\xi)_{S+N} = e^{i\xi\Gamma_0'}\{\det[I - i\xi\psi_N(a_0^2k_{S0} + k_{N0})C_0]\}^{-1},$$ (127b)

where $k_{N0} = \rho_0$ and $k_{S0} = \psi_S^{-1}K_{S0}$. The significant feature
about these relations, compared with those obtained by
the "broad-band" approach, (31), is that now the
determinant appears to the first power (in the denominator),
so that we may expect the exact treatment using the
eigenvalue method to be effective here. The Fourier
transforms of these characteristic functions are calculated
in (183) and (184), yielding $P_n(y)$, $Q_n(y)$. The error
probabilities then follow according to (125). To illustrate
here, we have, for only positive eigenvalues:

N+ n+ (ik) 


* >> e7 (loek-To')/de ow I] (1 = NOT NO 


k=1 7=1 


log K — To => 0, 


(128a) 
n+ n+(j#k) 
= ell = xMpdr, 
k=1 7 
log K — Th < 0 
nd 
* 5 —— Boe ORES EN 
ai ] 
n+(j#k) 
I] G@ — NAdsiw, 
log K — IG > 0 
— at) ,log K —T < 0 (128b) 


From the preceding analysis we note that $(\lambda_j)_N$, $(\lambda_j)_{S+N}$ are
the eigenvalues of $G_N$, $G_{S+N}$, i.e., the coefficients of
$-i\xi$ in the determinants (127a), (127b) above, see (153).
For continuous sampling in white noise, $(n \to \infty)$,
and the eigenvalues are found from the appropriate
integral equations (152a).²⁸ In the threshold cases we can
also use the trace-method, as was done in Section V,
without having to calculate these eigenvalues directly.


VIII. SUBOPTIMUM SYSTEMS


The methods of decision theory are equally applicable
to specified systems which are not optimum. By computing
the average risk R, (3), in such cases, one can then compare
the given (suboptimum) receiver with the optimum one
for the same purpose, and in this way obtain a definite
measure of the extent to which actual performance is
degraded for the chosen criterion.²⁹ Here we shall briefly
outline the analysis of a class of nonideal detection
systems for operation in white noise backgrounds, to
indicate how such comparisons might be made. This is a
problem of considerable practical importance, since it
is almost never possible to construct an exactly optimum
system.

We let $\log L_n(V)$ be a general square-law system, since
there is little point here in introducing a more complicated
structure when it is known that the quadratic device
with appropriate weighting is optimum, [see (21) and
(23)]. Accordingly, let us write

$$\log L_n(V) = G_0 + \tfrac{1}{2}\tilde{V}HV = G_0 + \Psi_n,$$ (129)

where $G_0$ is the bias and H is the weighting, which in
general is not equal to C for optimum receivers; $G_0$ may
not be set equal to $\Gamma_0$, and usually we may wish to choose
some $G_0 \neq \Gamma_0$ to help compensate for the fact that $H \neq C$.

28 Bibliography [4], Paper No. II for some explicit results in the
case of the high-Q, LRC noise signal.

29 Bibliography [1], (3) and (3.6).


Middleton: Stochastic Signals in Additive Normal Noise—Part I 


103 


Here $\Psi_n$, like $\Phi_n$ in (23), is the structure factor of the
detector. Next, we let

$$H = \psi_N^{-1}D_0,$$ (130)

where now $D_0$ is specified a priori in some manner and
embodies the various filtering operations of the system,
as we shall see presently. (For the moment the background
noise is regarded as having a finite $\psi_N$, here as
band-limited white noise. At the appropriate point we shall let
$B \to \infty$ and use $\lim_{B\to\infty}\psi_N = \lim_{n\to\infty} nW_{0N}/2T$ in the
analysis to obtain the desired expressions for continuous
sampling in white noise backgrounds.) Again, unless
$D_0 = I - (K_SK_N^{-1} + I)^{-1}$, (19a), our
receiver is not optimum, i.e., $H \neq C$.

Although this is not a unique representation as far as
$\Psi_n$ is concerned (vide the discussion in Section V-A, for
optimum systems), let us postulate that our receiver
(except for a switch at t = 0, T) has a linear,
time-invariant filter preceding the nonlinear operation. Then if
$V_F(t)$ is the output of this filter and if $h_F(t - \tau)$ is its
weighting function, so that

$$V_F(t) = \int_0^T V(\tau)h_F(t - \tau)\,d\tau, \qquad 0 < t < T,$$ (131)


we can write the sampled data in matrix form, $V_F = QV$,
where $Q = [h_F(t_j - t_k)\Delta t]$; see Appendix V. Next, let us
compute

$$\tfrac{1}{2}\psi_N^{-1}\tilde{V}_FV_F = \tfrac{1}{2}\psi_N^{-1}\sum_{jk}V_jV_k\sum_l h_F(t_l - t_j)h_F(t_l - t_k)(\Delta t)^2,$$ (132)

which suggests that we set

$$(D_0)_{jk} = \sum_l h_F(t_l - t_j)h_F(t_l - t_k)(\Delta t)^2,$$ (133)

so that for band-limited white noise, sampled at the
proper times $t_j = j/2B$ (in order that $k_N = I$), we have
again

$$\Psi_n = \tfrac{1}{2}\psi_N^{-1}\tilde{V}D_0V = \tfrac{1}{2}\tilde{V}HV = \tfrac{1}{2}\psi_N^{-1}\tilde{V}_FV_F.$$ (134)

By choosing $h_F$ beforehand, we then specify $D_0$ a priori,
as mentioned above. Note that Q in effect diagonalizes
H (see Appendix V) and in addition that the subsequent
nonlinear operation has zero memory. In fact, the entire
system here is identical with the one whose block
diagram is shown in Fig. 2 except that now $\log L_n(V)$ is
not in general optimum.
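The equivalence expressed by (132) to (134), namely that the quadratic form in $D_0$ equals the sum of squares of the filtered samples, can be illustrated with a short Python sketch. The RC-type weighting function `h_rc`, the sample values, and the helper names are hypothetical choices made only for this illustration.

```python
import math

def filter_matrix(h, times, dt):
    """Q = [h(t_j - t_k) * dt] for the sampling times t_1, ..., t_n."""
    return [[h(tj - tk) * dt for tk in times] for tj in times]

def weighting_matrix(Q):
    """D_0 = Q~ Q, i.e. (D_0)_jk = sum_l Q_lj Q_lk, as in (133)."""
    n = len(Q)
    return [[sum(Q[l][j] * Q[l][k] for l in range(n)) for k in range(n)]
            for j in range(n)]

def quad_form(D, v):
    """The quadratic form v~ D v."""
    n = len(v)
    return sum(v[j] * D[j][k] * v[k] for j in range(n) for k in range(n))

def h_rc(t, w=3.0):
    """Hypothetical causal RC-type weighting function (an assumption)."""
    return w * math.exp(-w * t) if t >= 0.0 else 0.0

n, T = 8, 1.0
dt = T / n
times = [(j + 1) * dt for j in range(n)]
v = [math.sin(1.7 * j) for j in range(n)]            # arbitrary data samples
Q = filter_matrix(h_rc, times, dt)
D0 = weighting_matrix(Q)
v_f = [sum(Q[j][k] * v[k] for k in range(n)) for j in range(n)]   # V_F = Q V
lhs = quad_form(D0, v)                 # V~ D_0 V
rhs = sum(x * x for x in v_f)          # V_F~ V_F
```

The two quantities `lhs` and `rhs` agree to rounding error for any choice of weighting function, since the identity is purely algebraic; only the optimality of the resulting detector depends on $h_F$.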

For continuous sampling in white noise backgrounds
we find, as in Section V-A, that the structure factor is

$$\Psi_T = \frac{1}{W_{0N}}\int\!\!\int_0^T V(t_1)\rho_F(t_1, t_2)_0V(t_2)\,dt_1\,dt_2,$$ (135)

where

$$\rho_F(t_1, t_2)_0 = \rho_F(t_2, t_1)_0 = \int_0^T h_F(\tau - t_1)h_F(\tau - t_2)\,d\tau, \qquad (0 < t_1, t_2 < T),$$ (136)

as in (15), e.g., $(D_0)_{jk} \to \rho_F(t_j, t_k)_0\,\Delta t$ in the limit $B \to \infty$
(or $n \to \infty$). The continuous analog of $\log L_n(V)$ is
accordingly

$$\log L_T(V(t)) = G_0 + \frac{1}{W_{0N}}\int_0^T V_F(t)^2\,dt = G_0 + \frac{1}{W_{0N}}\int\!\!\int_0^T V(t_1)\rho_F(t_1, t_2)_0V(t_2)\,dt_1\,dt_2.$$ (137)

Note that instead of a time-invariant matched filter and
square-law rectifier, followed, as usual, by the ideal
integrator $\int_0^T(\ )\,dt$, (Fig. 2), we can equally well regard
$\Psi_T$ as computed by a linear, time-varying filter, and
multiplier, as indicated in Fig. 1, following the argument
of (74) and (75) above.

Although we have used an ideal integrator for the post- 
rectifier filter, since the system itself is not optimum we 
expect by using some other post-detection filter that an 
improvement in performance is possible. Returning 
to the discrete case for the moment, we can express this 
now as 


vy ViVi 
= vw DI (Ve), 


v= 


Vr = [A(Vr):] (138) 
where the weighting A; is normalized in some fashion: 
A; =yrh(T — t;), withyr = f>hi(T — f)dt, correspond- 
ing to >\" A,/n = 1. The structure factor for continuous 
sampling becomes with the help of V,; = QV, etc. above, 


i 
ws = Ze ff Voth, t)oV(b) dt dt, (139) 
Wow 
0 


where now 
T 

pk, to = | yh(T — a)hele — t)oha(e — b) de, 
0 


O<st,4 <7). (139) 


We still have the linear filter (h), followed by the zero- 
memory (full-wave) square law element, but with a 
post-rectifier filter that is no longer the simple ideal 
integrator of our previous discussion. 

With the above in mind we can make several general
statements about system performance:


1) For the optimum receiver no improvement in performance
is gained by using a post-detection filter other than the
ideal integrator. Any other filter will decrease the
effectiveness of the system. This follows at once from
the fact that the optimum system is Bayes and is
unique.³⁰

2) When a nonoptimum receiver is used, post-detection
filtering other than ideal integration can give improved
performance. (By improved performance is meant
smaller average risk. This average risk, R, however,
from the definition of the optimum system as one that
minimizes average risk, can never be less than the
Bayes risk, R*.) By adjusting $h_I(T - t)$ it is possible
to alter R, so that some $h_I$ will lead to systems with
smaller risk than when simple, ideal integration is
employed. To prove this, we must calculate
R under these various choices, but from physical
considerations we see that post-detection filtering can
be used to enhance the signal vis-à-vis the noise, if the
predetection linear filter, $h_F$, is not optimum.


30 Bibliography [1], (2) and Appendix I.


The above results are generalizations of earlier theorems 
found by Van Vleck and Middleton [8], who derived 
similar statements on the basis of calculations of signal- 
to-noise ratios at the output of a radar receiver. Here, 
however, the decision process is explicitly taken into 
consideration, and all information in the received wave 
as well as a priori data is used. 

We note, finally, that to compare performances of ideal
and nonideal systems, the error probabilities α, β must
be obtained. These follow as in Section III, and for the
structure (129) assumed here we get after some
manipulation

$$\alpha = \int_{\log K}^{\infty} Q_n(x)_0\,dx; \qquad \beta = \int_{-\infty}^{\log K} P_n(x)_0\,dx,$$ (140)

where

$$Q_n(x)_0 = \int_{-\infty}^{\infty}\frac{e^{-i\xi(x - G_0)}}{[\det(I - i\xi D_0)]^{1/2}}\,\frac{d\xi}{2\pi}; \qquad P_n(x)_0 = \int_{-\infty}^{\infty}\frac{e^{-i\xi(x - G_0)}}{\{\det[I - i\xi D_0(I + a_0^2k_S)]\}^{1/2}}\,\frac{d\xi}{2\pi},$$ (141)

with $a_0^2 = \psi_S/\psi_N$ once more.³¹ The average risk is then
computed from (3).

Exactly the same sort of relations are found when the
signal is narrow-band and an alternative structure using
the "components" $V_c$, $V_s$ of the received wave is
employed. We have, analogous to (127),

$$Q_n(y)_0 = \int_{-\infty}^{\infty}\frac{e^{-i\xi(y - G_0')}}{\det(I - i\xi D_0')}\,\frac{d\xi}{2\pi}; \qquad P_n(y)_0 = \int_{-\infty}^{\infty}\frac{e^{-i\xi(y - G_0')}}{\det[I - i\xi D_0'(I + a_0^2k_{S0})]}\,\frac{d\xi}{2\pi},$$ (142)

in which $G_0'$, $D_0'$ are the corresponding narrow-band
versions of $G_0$ and $D_0$. Trace methods enable us to evaluate
α, β in the general case (141), as well as (142), for threshold
reception, while the exact procedure using the eigenvalues
of $D_0'$, etc. may be successfully applied, as in Section VII,
for all signal strengths when the signal is narrow-band.


IX. CONCLUSION


In the preceding sections we have formulated the 
problem of the optimum and suboptimum detection of 
normal noise signals in (additive) normal noise back- 
grounds, using the methods of statistical decision theory. 


31 We assume that the noise is band-limited and white, $\psi_N = BW_{0N}$,
and that sampling occurs at the intervals $t_j = j/2B$, so that
$k_N = I$.


June — 


T). To prove this, we must calculate — 


n,Q, 


957 


The general solutions for both broad- and narrow-band
signal processes in either colored or white noise have been
indicated and particular attention, along with a variety
of specific results, has been given to the important class
of systems that operate against white noise. While our
main effort in this regard has been directed to weak-signal,
or threshold operation (which is usually the more
important situation in practice), the general structure of
the optimum receiver has been found for all [(a priori
known) rms] input signal levels and is represented in
terms of physically realizable elements. The trace-method
introduced here (Appendix II) enables us to obtain results
in the threshold cases without having to calculate
eigenvalues explicitly and is convenient from the point of view
of approximation when it is necessary to determine error
probabilities and average risk for system operation.

It should be pointed out that the above detectors are
optimum under the assumption of a known (rms) input
signal level (as well as rms noise level). In many instances,
however, this (rms) input signal level is not available to
the observer, who at most has simply a distribution of
such possible values. Then optimum performance must
take this into account by suitable average in $\langle F(V \mid S)\rangle_S$,
(5), (7), (8), and we may expect that the structure of the
resulting system, including the bias, will be noticeably
altered in many cases. Our present results can still be
employed, if we set $a_0^2$, or $\psi_S$, at the level appropriate to a
given value of minimum detectable signal; our system is
optimum for all input signals of that particular strength,
but is no longer optimum if the actual signal level is either
above or below this value. This, however, may not be
serious, since above and below this value our receiver will
still respond reasonably well: below this level we will
reject the hypothesis "signal plus noise" anyway, and
above this point we will accept it, albeit without the
sensitivity of an optimum system.

Finally, we remark that the subject of the detection of
random signals in noise is by no means exhausted here.
Detailed solutions for colored noise backgrounds, the
question of the unknown input signal level, strong-signal
operation, and explicit calculations for suboptimum
systems, including comparison with corresponding
optimum receivers, all remain for later study. Extension
to the case of fsk signals, and problems of more
complicated signal waveforms (although some aspects of these
questions are covered rather thoroughly in Price's work
[4]) are also topics for further investigation.


APPENDIX I
REDUCTION OF DET (I + γG); EIGENVALUE METHOD


A. Preliminary Remarks


One of our principal technical problems in obtaining
useful expressions for the bias and the error probabilities
associated with the detection systems discussed above is
the reduction of quantities such as det (I + γG) to a form
more convenient for analysis and, ultimately, for
computation. In Appendix I we shall outline briefly one such
method of reduction and some of its byproducts, and
reserve for Appendix II the discussion of an alternative
approach.

Accordingly, we begin by assuming that:

1) γ is in general a complex quantity;

2) I is the unit matrix; and

3) G is an (n × n) matrix all of whose elements are
real quantities. For the moment we shall not require that
G be symmetrical. From assumption 3) we can always
find an (n × n) matrix Q which diagonalizes G by the
similarity transformation³²

$$Q^{-1}GQ = \Lambda = [\lambda_j\delta_{jk}] \quad\text{or}\quad GQ = Q\Lambda,$$ (143)

where $\lambda_j$ (j = 1, ..., n) are the n distinct eigenvalues of
G and $\delta_{jk}$ is the familiar Kronecker delta $\delta_{jk} = 0$, (j ≠ k);
$\delta_{jj} = 1$. Thus, if we let the (column) vectors $f_j = [f_{kj}]$
be the eigenvectors, corresponding to the eigenvalues $\lambda_j$
in (143) above, the diagonalizing matrix Q is then simply
the (n × n) matrix formed by the $f_j$ as columns,³³ e.g.,
$Q = [f_1, f_2, \cdots, f_j, \cdots, f_n]$. Taking the lth row and
jth column of $GQ = Q\Lambda$, (143) above, we may write

$$(GQ)_{lj} = \sum_k G_{lk}Q_{kj} = \lambda_jQ_{lj},$$ (144a)

or since $Q_{kj} = f_{kj}$ we get for the lth row of (143)

$$\sum_k G_{lk}f_{kj} = \lambda_jf_{lj}, \qquad (l = 1, \cdots, n;\ j = 1, \cdots, n),$$ (144b)

or in vector form

$$Gf_j = \lambda_jf_j \qquad (j = 1, \cdots, n).$$ (144c)

The eigenvalues are first found by solving the secular
equation det (G − λI) = 0, while the eigenvectors are
then obtained from (144b), except for an arbitrary
constant in each component. When G is symmetrical, Q
can be made an orthogonal transformation, e.g., $\tilde{Q}Q = I$,
and the arbitrary constants are removed. The
orthonormalizing condition for the n eigenvectors is then

$$\sum_{k=1}^n Q_{kl}Q_{kj} = \delta_{lj}, \quad\text{or}\quad \sum_{k=1}^n f_{kl}f_{kj} = \delta_{lj},$$ (145)

while the eigenvalues are found as before from det (G − λI)
= 0, and the eigenvectors from the n linearly independent
relations (144b), subject now to (145). For symmetrical
or unsymmetrical matrices [satisfying (3)], in any case,
we can use (143) and the fact that det (AB) = det A
det B to write³⁴

$$\det(I + \gamma G) = \det(Q^{-1}Q + \gamma Q^{-1}GQ) = \det(I + \gamma\Lambda) = \prod_{j=1}^n(1 + \gamma\lambda_j).$$ (146)
32 See, for example, Bibliography [12], (10.15a). Also (10.14)
and (10.9c).

33 See Bibliography [12], ch. 10.

34 Even when the eigenvalues of G are not distinct and G is
unsymmetrical, it is still possible to write $\prod_{j=1}^n(1 + \gamma\lambda_j)$ for det
(I + γG), but there no longer exists a matrix Q, and hence a set
of eigenvectors $f_j$, which can be used to diagonalize G, see
Bibliography [12], (10-15b), as G cannot then be put into completely
diagonal form.

For the case of a colored noise "signal" in a white noise
background, G is proportional to $k_S$, the covariance
matrix of the signal, which is symmetrical; Q can therefore
be an orthogonal matrix, and we can relax the requirement
above on G that all its eigenvalues be distinct. On the
other hand, for the more general problem of colored
noise in colored noise G is proportional to $k_Sk_N^{-1}$, and
although both $k_S$ and $k_N$ are symmetrical, their product
is not (except in the unlikely situation that $k_S$ and $k_N$
commute). To diagonalize G as in (143) above, we must
reimpose the condition that its eigenvalues be distinct.
For the physical processes discussed here, this is assumed
always to be the case.
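The reduction (146) is easy to verify numerically in the smallest nontrivial case. The Python sketch below takes an arbitrary real, unsymmetrical 2 × 2 matrix and a complex γ (both illustrative assumptions) and compares the determinant computed directly with the product over eigenvalues, the latter obtained from the secular equation.

```python
import cmath

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def eigenvalues2(g):
    """Roots of the secular equation det(G - lambda I) = 0 for a 2 x 2 G."""
    tr = g[0][0] + g[1][1]
    disc = cmath.sqrt(tr * tr - 4.0 * det2(g))
    return (tr + disc) / 2.0, (tr - disc) / 2.0

G = [[2.0, 1.0], [0.5, 3.0]]     # real, unsymmetrical, distinct eigenvalues
gamma = 0.3 - 0.8j               # gamma is in general complex

lam1, lam2 = eigenvalues2(G)
direct = det2([[1.0 + gamma * G[0][0], gamma * G[0][1]],
               [gamma * G[1][0], 1.0 + gamma * G[1][1]]])
product = (1.0 + gamma * lam1) * (1.0 + gamma * lam2)
```

Here `direct` is det (I + γG) and `product` is the right-hand side of (146); the two agree to rounding error.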


B. Continuous Sampling 


In most of our present applications continuous sampling
is ultimately postulated. Thus, in going from the discrete
to the continuous process the intervals between sampled
values are allowed to become arbitrarily small, while the
total number, n, of sampled values becomes infinite. Two
situations are distinguished: 1), where the observation
interval (0, T) remains finite; and 2), where $T \to \infty$; in
either case it is assumed that $\lim_{n\to\infty} G$ goes over into
$G(t_1, t_2)$, all $t_1$, $t_2$ in (0, T), where G has suitable continuity
and convergence properties. Our determinant (146)
becomes

$$D_T(\gamma) = \lim_{n\to\infty}\det(I + \gamma G) = \prod_{j=1}^{\infty}(1 + \gamma\lambda_j^{(T)}).$$ (147)

$D_T(\gamma)$ is known as the Fredholm determinant [12] and
since the eigenvalues of G are all distinct, we may have
recourse to the usual Hilbert theory to write $\lambda_j^{(T)}$ as the
limiting form of the eigenvalues $\lambda_j$ when $n \to \infty$, $(T < \infty)$
[see (149) et seq.]. The determinant $D_T(\gamma)$ is absolutely
convergent [13] for all $(0 < (t_1, t_2) < T)$ provided
$|\gamma G(t_1, t_2)| < M_0$, where $M_0$ is the maximum value of
$|\gamma G|$ in the region $0 < (t_1, t_2) < T$. The same condition still
applies in the second situation of infinite observation
periods $(T \to \infty)$, where the region of convergence is now
$(0 < t_1, t_2 < \infty)$, and where $D_T(\gamma) \to D_\infty(\gamma)$.

We shall need some further properties of the eigenvalues
$\lambda_j$, $\lambda_j^{(T)}$, etc. Returning for the moment to the discrete
case $(n < \infty)$ again, we may write the well-known result
for any matrix G (here with distinct eigenvalues):

$$\sum_{j=1}^n \lambda_j^m = \operatorname{trace} G^m, \qquad (m \ge 0).$$ (148)
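As a quick check of (148), the Python sketch below uses a symmetric tridiagonal 3 × 3 matrix whose eigenvalues are known in closed form, $2 - 2\cos(k\pi/4)$ for k = 1, 2, 3, and compares their power sums with the traces of the matrix powers. The matrix and helper names are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_of_power(g, m):
    """trace(G^m), with G^0 taken as the unit matrix."""
    n = len(g)
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        p = matmul(p, g)
    return sum(p[i][i] for i in range(n))

G = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
eigs = [2.0 - 2.0 * math.cos(k * math.pi / 4.0) for k in (1, 2, 3)]

power_sums = [sum(lam ** m for lam in eigs) for m in range(5)]
traces = [trace_of_power(G, m) for m in range(5)]
```

This is the property exploited by the trace-method: sums of powers of the eigenvalues are available without solving the secular equation.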
To obtain limiting forms as n —> ©, let us first require 
that T < o and then set ¢, = kT/n, At = T/n, etc. We 
assume that lim,... T/nd; = }” exists (all 7), with \{” 
discrete, (an assumption here that follows from the 
Hilbert theory of integral equations). Multiplying both 
sides of (148) by (7'/n)” and passing to the limit then 
gives 


1 


2 | af Gn OGhabe GC) he ode 
0 


eB (m > 1). (149a) 
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For T — ~ we get in similar fashion 


1% roy" = [. : / GG, Bh)? @G fh) ain are 
(149b) 


= Bo, (m2 1). 


In this last instance G must be such that the integrals 
exist, or equivalently that the series converge. The Bi”, 
B“ are known as the iterated kernels of a certain class of 
integral equations, described below briefly. 
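As a numerical illustration of the limit leading to (149a), the following sketch approximates $B_m^{(T)}$ by $(T/n)^m\,\text{trace}\,G^m$ for a sampled kernel. The kernel $G(t_1, t_2) = e^{-c|t_1 - t_2|}$, the midpoint sampling instants (used here in place of $t_k = kT/n$ only to speed convergence), and all function names are assumptions made for this illustration. For this kernel direct integration gives $B_1^{(T)} = T$ and $B_2^{(T)} = (2cT - 1 + e^{-2cT})/2c^2$, which the discrete values should approach as $n$ grows.

```python
import math

def sampled_kernel(n, T, c):
    """n x n matrix G_kl = exp(-c|t_k - t_l|) at midpoints t_k = (k + 1/2) T / n."""
    h = T / n
    t = [(k + 0.5) * h for k in range(n)]
    return [[math.exp(-c * abs(tk - tl)) for tl in t] for tk in t]

def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def trace_power(g, m):
    """trace of G^m for an integer m >= 1."""
    p = g
    for _ in range(m - 1):
        p = matmul(p, g)
    return sum(p[k][k] for k in range(len(g)))

def iterated_kernel(m, n, T, c):
    """Discrete approximation (T/n)^m trace G^m to B_m^(T)."""
    return (T / n) ** m * trace_power(sampled_kernel(n, T, c), m)

def b2_exact(T, c):
    """Closed form of B_2^(T) for the exponential kernel (direct integration)."""
    return (2 * c * T - 1 + math.exp(-2 * c * T)) / (2 * c * c)
```

With, say, `iterated_kernel(2, 100, 1.0, 1.0)` the discrete value can be compared against `b2_exact(1.0, 1.0)`; the agreement improves as `n` increases, which is the content of the limiting statement above.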

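Another useful expression for $\det (I + \gamma G)$, which may
be carried over to the continuous case with the help of the
above, is given by the expansion of the determinant as a
polynomial in $\gamma$:³⁵

$$\det (I + \gamma G) = \sum_{m=0}^{n} \frac{\gamma^m}{m!}\, D_m^{(n)}; \qquad
D_m^{(n)} = \sum_{k_1, \ldots, k_m = 1}^{n}
\begin{vmatrix}
G_{k_1 k_1} & \cdots & G_{k_1 k_m} \\
\vdots & & \vdots \\
G_{k_m k_1} & \cdots & G_{k_m k_m}
\end{vmatrix}. \qquad (150)$$

Evaluating the determinants $D_m^{(n)}$ shows that we can write
$D_m^{(n)}$ as a function of the traces of $G, G^2, \ldots, G^m$, viz.:

$$D_m^{(n)} = D_m^{(n)}(\text{trace}\, G,\ \text{trace}\, G^2,\ \ldots,\ \text{trace}\, G^m). \qquad (151)$$

The corresponding limiting form for continuous sampling
$(n \to \infty)$ is given in (161c) following.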

The integral equations from which the $\lambda_j^{(T)}$, $\lambda_j^{(\infty)}$ are
found can be obtained directly from (144b). The matrix
$G$ goes over into $G(t_1, t_2)$, the eigenvectors $f_j$ become the
eigenfunctions $f_j$, and instead of the sum in (144b) one
gets an integral. Formally, let us multiply both sides of
(144b) by $T/n$, with $t_k = kT/n$, $\Delta t = T/n$, etc. as before,
and pass to the limit $(n \to \infty)$, making the assumption
above that $\lim_{n\to\infty} (T/n)\lambda_j = \lambda_j^{(T)}$ as justified by the
Hilbert theory, if the $\lambda_j^{(T)}$ (and $\lambda_j$) are discrete and
distinct (and if there are at most a finite number of
(finite) negative eigenvalues $\lambda_j^{(T)}$). The result is

$$\int_0^T G(t, \tau)\, f_j^{(T)}(\tau)\, d\tau = \lambda_j^{(T)} f_j^{(T)}(t), \quad (0 \le t \le T); \qquad (152a)$$

$$\int_0^\infty G(t, \tau)\, f_j^{(\infty)}(\tau)\, d\tau = \lambda_j^{(\infty)} f_j^{(\infty)}(t), \quad (0 \le t < \infty). \qquad (152b)$$

For discreteness of the eigenvalues we require that $G(t, \tau)$
be quadratically integrable³⁶ in the respective regions
$(0 \le t, \tau \le T)$, $(0 \le t, \tau < \infty)$. If $G$ is symmetrical, or
equivalently if $G(t_1, t_2) = G(t_2, t_1)$ in the continuous case,
it is a straightforward matter to insure that the eigenfunctions
belong to a complete orthonormal set; the
condition for this follows directly from (145) and is


35 See Bibliography [13], (15), p. 24.
36 See, for example, Bibliography [14], the footnote preceding
.10b).


$$\int_0^T f_j^{(T)}(t)\, f_k^{(T)}(t)\, dt = \delta_{jk}, \qquad (152c)$$

where the $f_j$, $f_k$ are, of course, solutions of (152b) if
$(T \to \infty)$. Notice, however, that the integral equations (152) apply
in any case, even if $G(t_1, t_2)$ does not equal $G(t_2, t_1)$, as
long as the eigenvalues are discrete.
Eigenvalues and eigenfunctions for a number of different
kernels $G$ are listed in the table below. Although we
consider only Cases I and II specifically in the present
paper, we include some results for more involved situations
as well, along with a brief description of the type of
problem in which they may occur. Details of the solutions
are available in the references. The kernels here take the
general form $G(t_1, t_2) = A(t_2)\, K(|t_1 - t_2|)$, where $\int_0^T A(t)\, dt$
exists as indicated in Table II.
For Case I in Table II, approximate expressions for
eigenvalues of high order are readily obtained. To a
first approximation we may write

$$\text{Case I:} \quad \lambda_j \simeq \frac{2\kappa a^2 T g}{\pi^2 (j - 1)^2 + g(4 + g)}, \quad (j \gg 1), \quad g = cT. \qquad (153)$$


Higher order corrections to these results may be found by
successive approximations. We note for all the cases listed
in Table II that the eigenvalues are positive (since $A$,
$\kappa > 0$) and approach zero as $O(j^{-2})$ as $j \to \infty$. The various
iterated kernels, $B_m^{(T)}$, $B_m^{(\infty)}$, (149a) and (149b), are therefore
finite, since the series in $\lambda_j^{(T)}$, etc. are clearly convergent
$(m \ge 1)$. In some instances, we may have negative
eigenvalues, as well, if $A(u)$ is less than zero for some
ranges of $u$, but at most these will be limited to a finite
number of (discrete) eigenvalues.³⁷ We remark, finally,
that for certain choices of the parameter $\nu$ in
Case III, viz., $\nu = 1/2, 3/2$, the eigenvalues can be given
precisely for all orders:


Case rit: Yor (He 
Af (q) = [2 cos a, = 0; 
2 qi Tq; 7 5] 
Abst 8ceKa” 
S25 
3/2 qe 8 Peles 
i : 
Dap 0Os = 2a =); 
1/2\4;) = Nig; qi } 
2 
~ hy = ced 
is 


37 See, for example, Bibliography [14], the discussion following
(4.15b), where $A(u)$ may represent the weighting function of a
CR-video filter, e.g., $A(u) = \delta(u - 0) - (RC)^{-1} e^{-u/RC}$, $u > 0-$.


TABLE II 


THE INTEGRAL EQUATION 


$$\int_0^T A(u)\, K(|t - u|)\, f_j(u)\, du = \lambda_j f_j(t), \quad (0 < t < T)$$
Model [a?, 6, n, K > O} A(u) K(|t — ul) Eigenvalues Eigenfunctions 
I) [6 = 0; T < @]; distribution of [,7 x(¢)*de; a Pye 9 xa? Alo AB (ears aor 
x(t) a Gauss process with zero mean; 1) 4; = eG aa) ) ‘ 
x = RC noise, or 2) x = high-Q, LRC noise i . A! cos y;(t — T/2) 
tan (eDgh iy Seen in eet 


all OR > 0, 
ae al) 


ly; = cq; | 
1 = cv 2xa’/cerd; — 1 


iO < @ < ©): 1) Markoff scatter; —cl|t—u| 
A(u) = (ae-™)?, optimum detection of RC 


or high-Q, LRC noise “‘signal’’ in white noise 


2 .—2bu 
é 


q; (positive) roots of 


BP. 7,(ge"%) 
Jruile "9g, N,-1(g,) = 


J,-1(4i) 


[4]. 2) RC or high-Q, ae ee ae 8 
square-law detector, followed by an y = 
Bideo filter, A(w). Distribution of [7 A(¢—u) Neale, -O:) sons + N,(qje ) 
I(w)?du. ae 
E eee 
q; Nabe 
p= 6/6 
Ill) (1 — ); 1) As in 1), II above. 2) As 2,—2bu pli | “ys one Snare 
in a II above; Juncosa’s integral equation Oe ne qi (positive) roots Aes} Pr, Cie 
[15]. See also Kae and Siegert [5]. i OG: F 
_ , [Rexa? 
qi me Nb: 


p = 06/ 6

and these results may then be used to sum the various
series for the bias, error probabilities, and average risk,
in these special cases.³⁸ The similar nature of these integral
equations stems from the fact that the underlying probability
mechanisms are all normal, although the problems
in which they arise are physically quite different.


APPENDIX II 


REDUCTION OF DET (I + γG); TRACE METHOD;
EVALUATION OF INTEGRALS


A. Fundamental Identity


With the $(n \times n)$ matrix $G$ of Appendix I-A, let us
examine the determinantal expansion (150) of $\det (I + \gamma G)$
in more detail. From this, in fact, we can establish the
following identity, which is basic to the so-called trace
method for reducing expressions like $\det (I + \gamma G)$ to more
manageable forms and which is particularly suited in
detection (and extraction) theory to problems of threshold
performance.³⁹ The identity in question is specifically


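$$\exp \left\{ \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, \gamma^m\, \text{trace}\, G^m \right\} = \det (I + \gamma G), \qquad (155)$$

which holds whenever the exponential series converges.
Proofs of (155) can be given in several ways. A direct
method, based on the determinantal expansion (150), is
to develop both sides of (155) in a power series, compare
coefficients of $\gamma^k$, and observe for all $k > n$ that the
coefficients of $\gamma^k$ are identically zero. For $k \le n$ we get
simply the determinantal expansion (150) in both members
of (155), with $D_k^{(n)}/k!$ the coefficient of $\gamma^k$. Thus, from (150)
we have (whenever the $m$ series is convergent)

$$\exp \left\{ \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, \gamma^m\, \text{trace}\, G^m \right\} = \sum_{k=0}^{n} \frac{\gamma^k}{k!}\, D_k^{(n)}, \qquad (156)$$

and the coefficients $D_k^{(n)}$ are specifically

$$\begin{aligned}
D_0^{(n)} &= 1; \qquad D_1^{(n)} = \text{trace}\, G; \\
D_2^{(n)} &= \text{trace}^2 G - \text{trace}\, G^2; \\
D_3^{(n)} &= \text{trace}^3 G - 3\, \text{trace}\, G\ \text{trace}\, G^2 + 2\, \text{trace}\, G^3; \\
D_4^{(n)} &= \text{trace}^4 G - 6\, \text{trace}^2 G\ \text{trace}\, G^2 + 3\, \text{trace}^2 G^2 \\
&\quad + 8\, \text{trace}\, G\ \text{trace}\, G^3 - 6\, \text{trace}\, G^4, \ \text{etc.};
\end{aligned} \qquad (157)$$

with the higher-order terms obtainable from (150), or
from the expansion of the left member of the identity
(155).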
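The identity (155) and the trace form of the coefficients (157) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the $3 \times 3$ test matrix, the value $\gamma = 0.1$ (chosen so that $|\gamma\lambda_1| < 1$), the truncation of the $m$ series at 60 terms, and all function names are choices made here, not part of the original development.

```python
import math

def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def power_traces(g, count):
    """[trace G, trace G^2, ..., trace G^count]."""
    traces, p = [], g
    for _ in range(count):
        traces.append(trace(p))
        p = matmul(p, g)
    return traces

def det3(a):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def det_direct(g, gamma):
    """det(I + gamma G), evaluated directly (3 x 3 only)."""
    return det3([[(1.0 if r == c else 0.0) + gamma * g[r][c]
                  for c in range(3)] for r in range(3)])

def det_from_exponential(g, gamma, terms=60):
    """Left member of (155), with the m series truncated after `terms` terms."""
    tr = power_traces(g, terms)
    s = sum((-1) ** (m + 1) / m * gamma ** m * tr[m - 1]
            for m in range(1, terms + 1))
    return math.exp(s)

def det_from_traces(g, gamma):
    """Polynomial sum of gamma^k D_k / k!, with D_k from (157); exact for n = 3."""
    t1, t2, t3 = power_traces(g, 3)
    d = [1.0, t1, t1 ** 2 - t2, t1 ** 3 - 3 * t1 * t2 + 2 * t3]
    return sum(gamma ** k / math.factorial(k) * d[k] for k in range(4))

# Assumed test matrix; its eigenvalues are 2 and 2 +/- sqrt(2).
G = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
```

For this matrix and $\gamma = 0.1$ the three routes (`det_direct`, `det_from_exponential`, `det_from_traces`) agree to rounding error; the exponential route relies on the series converging, whereas the polynomial route does not.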

This direct approach, however, does not readily reveal 
the conditions of convergence for the exponential series. 
A more elegant demonstration of (155), which at the same 
time establishes the desired interval of convergence, 
starts with the relations (146) and (148), viz.,

$$\det (I + \gamma G) = \prod_{j=1}^{n} (1 + \gamma\lambda_j); \qquad \sum_{j=1}^{n} \lambda_j^m = \text{trace}\, G^m \quad (m > 0), \qquad (158)$$


38 See Bibliography [5], (7.24)-(7.29), for example.
39 To the author's knowledge, the particular relation (155) does
not appear to have been noted previously in problems of this type.

and writes

$$\det (I + \gamma G) = \exp \Bigl\{ \log \prod_{j=1}^{n} (1 + \gamma\lambda_j) \Bigr\} = \exp \Bigl\{ \sum_{j=1}^{n} \log (1 + \gamma\lambda_j) \Bigr\}. \qquad (159)$$

Now let the $n$ distinct eigenvalues $\lambda_j$ $(j = 1, \ldots, n)$ be
arranged in descending order of magnitude, with $|\lambda_1|$ the
magnitude of the largest, e.g., $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_j|
> \cdots > |\lambda_n|$. Then the logarithm can be expanded in an
absolutely convergent series for all $\lambda_j$, provided $|\gamma\lambda_1| < 1$,
to give

$$\sum_{j=1}^{n} \log (1 + \gamma\lambda_j) = \sum_{j=1}^{n} \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, \gamma^m \lambda_j^m = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, \gamma^m \sum_{j=1}^{n} \lambda_j^m, \quad |\gamma\lambda_1| < 1, \qquad (160)$$

where the interchange of series is permitted, since the $m$
series is absolutely convergent $(|\gamma\lambda_1| < 1)$. But the series
over $j$ is just trace $G^m$, (158), and so the identity (155) is
established. The region of convergence in the complex
$\gamma$ plane is determined solely by the largest eigenvalue of
$G$ and is a circle of radius $r < |\lambda_1|^{-1}$. For example, in the
calculation of the bias term, (25), $\gamma$ is equal to $a_0^2$, so that the
left member of (155) is an exact expression for $\det
(I + a_0^2 G)$ whenever $a_0^2 < |\lambda_1|^{-1}$; $\det (I + a_0^2 G)$, as given
by (150), is, of course, defined for all $a_0^2$ $(< \infty)$. The
convergence condition suggests, then, that our exponential
representation should be particularly useful in threshold
situations, where $a_0^2$ is $O(|\lambda_1|^{-1})$ or smaller. [Note, incidentally,
that starting with the relation $\sum_j \lambda_j^m = \text{trace}\, G^m$,
we can also establish the expansion (150), or given (155),
we can in a similar way prove that (148) is valid.]

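In the limit $(n \to \infty)$ of continuous sampling, trace $G^m$
is replaced by the iterated kernel $B_m^{(T)}$ in the identity
(155), while $\lambda_j \to \lambda_j^{(T)}$, and $\det (I + \gamma G)$ becomes the
Fredholm determinant $\mathcal{D}_T(\gamma)$; see (148), (149). To see
this, we simply repeat the steps (158)-(160), where

$$\lim_{n\to\infty} \sum_{j=1}^{n} \Bigl( \frac{T}{n}\, \lambda_j \Bigr)^m = \lim_{n\to\infty} \Bigl( \frac{T}{n} \Bigr)^m \text{trace}\, G^m = B_m^{(T)}, \qquad (161a)$$

in the manner of (149a). The basic identity (155) is now
specifically

$$\mathcal{D}_T(\gamma) = \prod_{j=1}^{\infty} \bigl(1 + \gamma\lambda_j^{(T)}\bigr) = \exp \left\{ \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, \gamma^m B_m^{(T)} \right\}, \quad |\gamma\lambda_1^{(T)}| < 1, \qquad (161b)$$

with similar results for semi-infinite observation periods
$(T \to \infty)$. From this the continuous analog of (151)
follows at once on expanding the last member of (161b)
as a power series in $\gamma$, namely

$$\mathcal{D}_T(\gamma) = \sum_{m=0}^{\infty} \frac{\gamma^m}{m!}\, D_m^{(T)}\bigl(B_1^{(T)}, \ldots, B_m^{(T)}\bigr). \qquad (161c)$$

Observe that this relation and the first equation of (161b)
are in fact valid for all $\gamma$, since the Fredholm determinant
is absolutely and permanently convergent for the kernels
assumed here [13].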


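B. Evaluation of Integrals: Characteristic Function and
Distribution Densities


Let us now use both the fundamental identity (155)
and the eigenvalue form of $\det (I + \gamma G)$ to evaluate a
class of integrals that arises in detection and extraction
problems whenever normal noise is present. It is convenient
to consider this as a problem in probability distributions.
We write first

$$x = \log \Lambda_n(z), \qquad (162a)$$

where specifically the transformation between $z$ and $x$ is

$$x = \log \Lambda_n(z) = C_0 + \tilde{C}_1 z + \tfrac{1}{2}\, \tilde{z}\, C_2\, z, \qquad (162b)$$

in which $C_0$ is a scalar, $C_1$ is a column vector, like $z$, of $n$
rows, and $C_2$ is a (real), symmetrical $(n \times n)$ matrix. We
ask now for the distribution density $W_1(x)$ of the random
variable $x$, when $z$ is normal, with mean values $[\bar{z}]$ and
variances $[\overline{(z_j - \bar{z}_j)(z_k - \bar{z}_k)}]$, e.g., when the distribution
density of $z$ is

$$W_n(z) = \frac{\exp \{ -\tfrac{1}{2} (\tilde{z} - \tilde{\bar{z}})\, F^{-1} (z - \bar{z}) \}}{(2\pi)^{n/2} \sqrt{\det F}}, \qquad (163)$$

$$F = [\overline{(z_j - \bar{z}_j)(z_k - \bar{z}_k)}] = [F_{jk}] = \tilde{F};$$

[the ($\sim$) indicates the transposed matrix].

The probability density $W_1(x)$ can be expressed in
terms of its characteristic function $F_1(i\xi)_x$, as

$$W_1(x) = \int_{-\infty}^{\infty} F_1(i\xi)_x\, e^{-i\xi x}\, \frac{d\xi}{2\pi}, \quad \text{with} \quad F_1(i\xi)_x = \int_{-\infty}^{\infty} W_1(x)\, e^{i\xi x}\, dx. \qquad (164)$$

Using the transformation (162a) we can alternatively
write $W_1(x)$ as

$$W_1(x) = \int_{(z)} W_n(z)\, \delta(x - \log \Lambda_n(z))\, dz. \qquad (165)$$

With the help of the representation

$$\delta(x - \log \Lambda_n(z)) = \int_{-\infty}^{\infty} \exp \{ -i\xi [x - \log \Lambda_n(z)] \}\, \frac{d\xi}{2\pi}, \qquad (166)$$

we find that the characteristic function of $x$ becomes

$$F_1(i\xi)_x = \overline{e^{i\xi x}} = \langle \exp \{ i\xi \log \Lambda_n(z) \} \rangle = \int_{(z)} \exp \{ i\xi \log \Lambda_n(z) \}\, W_n(z)\, dz$$

$$= \frac{\exp \{ i\xi C_0 - \tfrac{1}{2} \tilde{\bar{z}} F^{-1} \bar{z} \}}{(2\pi)^{n/2} \sqrt{\det F}} \int_{(z)} \exp \{ i \tilde{z} (\xi C_1 - i F^{-1} \bar{z}) \}\, \exp \{ -\tfrac{1}{2} \tilde{z} (F^{-1} - i\xi C_2) z \}\, dz,$$

this last from (162b) and (163). Since $\det A = (\det A^{-1})^{-1}$,
we use the relation⁴⁰ that

$$\int \!\cdots\! \int \exp \{ i \tilde{z} u - \tfrac{1}{2} \tilde{z} A z \}\, dz = (2\pi)^{n/2} (\det A)^{-1/2} \exp \{ -\tfrac{1}{2} \tilde{u} A^{-1} u \} \qquad (167)$$

to obtain finally

$$F_1(i\xi)_x = \frac{\exp \bigl\{ -\tfrac{1}{2} \tilde{\bar{z}} F^{-1} [I - A^{-1}(i\xi)] \bar{z} + i\xi \bigl[ C_0 + \tfrac{1}{2} \tilde{\bar{z}} \{ \tilde{A}^{-1}(i\xi) + F^{-1} A^{-1}(i\xi) F \} C_1 \bigr] - \tfrac{1}{2} \xi^2\, \tilde{C}_1 A^{-1}(i\xi) F C_1 \bigr\}}{\sqrt{\det A(i\xi)}}, \qquad (168a)$$

where

$$A(i\xi) = I - i\xi F C_2. \qquad (168b)$$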

Our next step is to put the exponent of (168a) into a
form more convenient for manipulation. We let $G = FC_2$
and then diagonalize $G$ with the similarity transformation
$Q$; see (143). We note, then, that

$$A(i\xi)^{-1} = Q \{ Q^{-1} A(i\xi)^{-1} Q \} Q^{-1} = Q [I - i\xi\, Q^{-1} G Q]^{-1} Q^{-1} = Q [I - i\xi \Lambda]^{-1} Q^{-1}; \qquad \Lambda = [\lambda_j \delta_{jk}]. \qquad (169)$$


Applying this to the various quadratic forms in (168a), 
we easily find that 


st — -ls CS oe = EX ; 
ZF a(T aa A) 1% a —ZF ps3 Tr, — | | 


ZAC, = 77.C,; iy = | | (170) 


ZF ‘A 'FC, == Z(I aa INS ACS = VA DAC. 
GrA; FC; = CiEEIC. 
Letting 
Be alk Vx = Q;; 2;-C, = b;; 
k 
C; as CP i, = Cj, Gen) 
i 


we can write the characteristic function (168a) alter- 


natively as 


F (28), (172) 
— pike I (ae WO se) Sees = i) 
j=1 dd ra a ae i 
where we have used (146) for $\det A(i\xi)$, with $\gamma = -i\xi$.
The probability density of $x$, (164), cannot be given in
closed form, generally, even if $a_j = b_j = c_j = 0$ (all $j$).

40 See Bibliography [7], (11) and (12).

With the help of the identity (155) we can, however,
obtain an asymptotic approximation which is useful in
all cases of threshold performance. To do this we first
expand the exponent of (172), retaining terms $O(\xi, \xi^2)$
only, and then develop the rest in a series in $\xi$. For example,
the exponent of (172) becomes


= Ji&(b; ++ Aas) = igs.) 
Ste 


2 


2 
= BY ig — 5 BP +0@8), 173) 


where 
n 


ne = E (402) 


7=1 


n 


De E + ar,(o, +; ei etc., (173a) 


q=l 


) 
ES 


Il 


and higher terms are readily computed in similar fashion.
From (155) we have

$$\prod_{j=1}^{n} (1 - i\xi\lambda_j)^{-1/2} = \exp \left\{ \sum_{m=1}^{\infty} \frac{(i\xi)^m}{2m}\, \text{trace}\, G^m \right\} = \exp \Bigl\{ \frac{i\xi}{2}\, \text{trace}\, G - \frac{\xi^2}{4}\, \text{trace}\, G^2 + \cdots \Bigr\}, \qquad (174)$$

so that the characteristic function is now

$$F_1(i\xi)_x \simeq \exp \bigl\{ i\xi \bigl( C_0 + E_1^{(n)} + \tfrac{1}{2}\, \text{trace}\, G \bigr) \bigr\} \cdot \exp \Bigl\{ -\frac{\xi^2}{2} \bigl( E_2^{(n)} + \tfrac{1}{2}\, \text{trace}\, G^2 \bigr) + O(\xi^3) \Bigr\}. \qquad (175)$$

The corresponding probability density is asymptotically
normal, with mean and variance

$$\bar{x} = C_0 + E_1^{(n)} + \tfrac{1}{2}\, \text{trace}\, G; \qquad \overline{x^2} - \bar{x}^2 = \sigma_x^2 = E_2^{(n)} + \tfrac{1}{2}\, \text{trace}\, G^2, \qquad (176)$$

viz.,

$$W_1(x) \simeq \frac{\exp \{ -(x - \bar{x})^2 / 2\sigma_x^2 \}}{\sqrt{2\pi\sigma_x^2}}. \qquad (177)$$

Correction terms, revealing the approach to the normal
law, are found in straightforward fashion from the Fourier
transform of the terms $O(\xi^3)$, and higher, in the characteristic
function (175) above. The method is illustrated
presently in a number of special cases [see (177) et seq.]
and is recognized as one form of the method of steepest
descents.

Although we cannot obtain $W_1(x)$ from (172) in closed
form, it is possible to calculate any desired moment of $x$
exactly, with the help of

$$\overline{x^\nu} = (-i)^\nu \left. \frac{d^\nu}{d\xi^\nu}\, F_1(i\xi)_x \right|_{\xi = 0}. \qquad (178)$$

The first two moments here are found to be

$$\bar{x} = C_0 + E_1^{(n)} + \tfrac{1}{2}\, \text{trace}\, G, \qquad (179a)$$

$$\overline{x^2} = E_2^{(n)} + \tfrac{1}{2}\, \text{trace}\, G^2 + \bigl( C_0 + E_1^{(n)} + \tfrac{1}{2}\, \text{trace}\, G \bigr)^2, \qquad (179b)$$

and higher moments, though more laboriously determined,
are calculated in the same fashion from (172), viz., (179a),
(179b). In a similar manner we can determine the semi-invariants,
or cumulants, $L_m$, from the defining relation

$$F_1(i\xi)_x = \overline{e^{i\xi x}} = \exp \left\{ \sum_{m=1}^{\infty} \frac{(i\xi)^m L_m}{m!} \right\}; \qquad (180)$$

as is well-known, $L_1 = \bar{x}$, $L_2 = \overline{x^2} - \bar{x}^2$, etc.
A modification of (162a), (162b) which is useful in our
present study of narrow-band models is

$$y = \log \Lambda_n(z_1, z_2); \qquad \log \Lambda_n(z_1, z_2) = C_0' + \tfrac{1}{2}\, \tilde{z}_1 C_2' z_1 + \tfrac{1}{2}\, \tilde{z}_2 C_2' z_2, \qquad (181)$$

where now $z_1$ and $z_2$ are normal random vectors with
zero means and identical variance matrices $F$. Furthermore,
$z_1$ and $z_2$ are statistically independent, so that their
joint distribution density is

$$W_{2n}(z_1, z_2) = W_n(z_1)\, W_n(z_2) = \frac{\exp ( -\tfrac{1}{2} \tilde{z}_1 F^{-1} z_1 - \tfrac{1}{2} \tilde{z}_2 F^{-1} z_2 )}{(2\pi)^n \det F}, \qquad (182)$$

[see (115)]. The characteristic function for $y$, (181), is easily
found by inspection of (168a) or (172), if we observe that
now $\bar{z} = 0$, $C_1 = 0$, and $z_1$, $z_2$ are independent. The
result is

$$F_1(i\xi)_y = e^{i\xi C_0'} \{ \det (I - i\xi F C_2') \}^{-1} = e^{i\xi C_0'} \prod_{j=1}^{n} (1 - i\xi\lambda_j')^{-1}, \qquad (183)$$

where the $\lambda_j'$ are the $n$ distinct eigenvalues of the matrix
$G' = FC_2'$, (see Table II). Unlike the more general case
(172) considered above, we can here obtain the distribution
density of $y$ without recourse to approximations,
since the singularities of $F_1(i\xi)_y$ occur at $\xi = -i/\lambda_j'$ and
are all simple. Contour integration applied to (183)
gives directly


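$$W_1(y) = \int_{-\infty}^{\infty} e^{-i\xi (y - C_0')} \prod_{j=1}^{n} (1 - i\xi\lambda_j')^{-1}\, \frac{d\xi}{2\pi}
= \begin{cases}
\displaystyle \sum_{k=1}^{n^+} \frac{e^{-(y - C_0')/\lambda_k'}}{\lambda_k'} \prod_{j \ne k}^{n} \Bigl( 1 - \frac{\lambda_j'}{\lambda_k'} \Bigr)^{-1}, & y > C_0', \\[2ex]
\displaystyle \sum_{k=1}^{n^-} \frac{e^{(y - C_0')/|\lambda_k'|}}{|\lambda_k'|} \prod_{j \ne k}^{n} \Bigl( 1 - \frac{\lambda_j'}{\lambda_k'} \Bigr)^{-1}, & y < C_0',
\end{cases} \qquad (184)$$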

where $n^+ + n^- = n$, and $n^+$ and $n^-$ are respectively the
number of positive and negative eigenvalues of $G' = FC_2'$.
With continuous sampling, $G'$ is restricted to have at
most a finite number of negative eigenvalues, while
$n^+ \to \infty$. The relations (183), (184) apply here also under
these conditions. In our present work (cf. Table II), all
eigenvalues are positive. Relations similar to (183),
(184) were obtained originally by Kac and Siegert [5] in
their discussion of the problem of distributions after
square-law rectification and linear filtering.
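For the case of present interest, in which all eigenvalues of $G'$ are positive and distinct, the density obtained from (183) by contour integration is a finite sum of weighted exponentials. The sketch below evaluates that partial-fraction form and checks two consequences: the weights sum to unity (the density is normalized) and the mean is $C_0' + \text{trace}\, G'$. The eigenvalues used, the default $C_0' = 0$, the trapezoidal rule, and the function names are all assumptions made here for illustration.

```python
import math

def residue_weights(eigs):
    """w_k = prod over j != k of (1 - lam_j / lam_k)^(-1), distinct eigenvalues assumed."""
    weights = []
    for k, lam_k in enumerate(eigs):
        w = 1.0
        for j, lam_j in enumerate(eigs):
            if j != k:
                w /= (1.0 - lam_j / lam_k)
        weights.append(w)
    return weights

def density_y(y, eigs, c0=0.0):
    """Density of y when every eigenvalue is positive; it vanishes for y <= c0."""
    if y <= c0:
        return 0.0
    return sum(w * math.exp(-(y - c0) / lam) / lam
               for w, lam in zip(residue_weights(eigs), eigs))

def trapezoid(f, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h
```

A typical check is `trapezoid(lambda y: density_y(y, [1.0, 0.5, 0.25]), 0.0, 40.0, 4000)`, which should be close to unity. Note that the partial-fraction form breaks down if two eigenvalues coincide, in line with the requirement that the poles of the characteristic function be simple.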


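C. Weak-Signal Expansions


In threshold operations we can always apply the trace
method and that of steepest descents to obtain asymptotic
expressions for the distribution densities $W_1(x)$ and $W_1(y)$
above. It is instructive first, however, to examine the
characteristic functions (183), and (168a), (168b), and
(172) when $C_1 = 0 = \bar{z}$. With the aid of the basic identity
we have now

$$F_1(i\xi)_x = e^{i\xi C_0} \prod_{j=1}^{n} (1 - i\xi\lambda_j)^{-1/2} = \exp \left\{ i\xi C_0 + \frac{1}{2} \sum_{m=1}^{\infty} \frac{(i\xi)^m}{m}\, \text{trace}\, G^m \right\}, \quad (C_1 = \bar{z} = 0), \qquad (185a)$$

$$F_1(i\xi)_y = e^{i\xi C_0'} \prod_{j=1}^{n} (1 - i\xi\lambda_j')^{-1} = \exp \left\{ i\xi C_0' + \sum_{m=1}^{\infty} \frac{(i\xi)^m}{m}\, \text{trace}\, (G')^m \right\}. \qquad (185b)$$

From the defining relation (180) for the semi-invariants
we can write at once

$$L_{1(x)} = C_0 + \tfrac{1}{2}\, \text{trace}\, G; \qquad L_{m(x)} = \frac{(m - 1)!}{2}\, \text{trace}\, G^m, \quad (m \ge 2), \qquad (186a)$$

and

$$L_{1(y)} = C_0' + \text{trace}\, G'; \qquad L_{m(y)} = (m - 1)!\ \text{trace}\, (G')^m, \quad (m \ge 2). \qquad (186b)$$

When the sampling is continuous in $(0, T)$, $\det (I - i\xi G)$
etc. becomes $\mathcal{D}_T(-i\xi)$, and trace $G^m$, trace $(G')^m$ are
replaced by the appropriate iterated kernels $B_m^{(T)}$; see
(149a), (149b). Similar remarks apply for the (semi-)
infinite observation period $(0, \infty)$.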
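The semi-invariants (186a), (186b) require only the traces of powers of $G$ (or $G'$), which by (148) are the power sums of the eigenvalues. The following sketch computes them from a list of eigenvalues; the function names and the sample values are assumptions for illustration only. As a sanity check, a single eigenvalue $\lambda = 2$ with $C_0 = 0$ makes $x$ the square of a unit normal variable, whose first three cumulants are 1, 2, 8; a single eigenvalue $\lambda' = 1$ with $C_0' = 0$ makes $y$ exponentially distributed with unit mean, whose cumulants are $(m - 1)!$.

```python
import math

def trace_power(eigs, m):
    """trace G^m expressed through the eigenvalues, as in (148)."""
    return sum(lam ** m for lam in eigs)

def semi_invariants_x(c0, eigs, order):
    """[L_1, ..., L_order] for x: L_1 = C_0 + trace G / 2, L_m = (m-1)!/2 trace G^m."""
    out = [c0 + 0.5 * trace_power(eigs, 1)]
    out += [math.factorial(m - 1) / 2 * trace_power(eigs, m)
            for m in range(2, order + 1)]
    return out

def semi_invariants_y(c0, eigs, order):
    """[L_1, ..., L_order] for y: L_1 = C_0' + trace G', L_m = (m-1)! trace G'^m."""
    out = [c0 + trace_power(eigs, 1)]
    out += [math.factorial(m - 1) * trace_power(eigs, m)
            for m in range(2, order + 1)]
    return out
```

For continuous sampling the same functions could be fed iterated-kernel values in place of the power sums, which is why no eigenvalue computation is needed in the threshold expansions.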

The threshold distributions of $x$ and $y$ are (asymptotically)
normal, with correction terms as indicated below.
To obtain this result, let us again retain only terms
$O(\xi, \xi^2)$ in the exponents of the characteristic function,
and develop the rest in a series. Using (180) we may
accordingly write in general,

$$F_1(i\xi) = \exp \left\{ \sum_{m=1}^{\infty} \frac{(i\xi)^m L_m}{m!} \right\}
= \exp \{ i\xi L_1 - \xi^2 L_2 / 2 \} \Bigl\{ 1 + \frac{L_3 (i\xi)^3}{3!}
+ \Bigl[ \frac{L_4 (i\xi)^4}{4!} + \frac{L_3^2 (i\xi)^6}{72} \Bigr]
+ \Bigl[ \frac{L_5 (i\xi)^5}{5!} + \frac{L_3 L_4 (i\xi)^7}{144} + \frac{L_3^3 (i\xi)^9}{1296} \Bigr] + \cdots \Bigr\}, \qquad (187)$$

which is an expansion of the Edgeworth type,⁴¹ chosen
because it leads to truly asymptotic representations of
$W_1(x)$, $W_1(y)$.
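From the fact that

$$\int_{-\infty}^{\infty} (i\xi)^k \exp ( -i\xi z - \xi^2 / 2 )\, \frac{d\xi}{2\pi} = (-1)^k \frac{d^k}{dz^k} \frac{e^{-z^2/2}}{\sqrt{2\pi}} = (-1)^k \phi^{(k)}(z) \qquad (188)$$

we obtain from (164)

$$W_1(x) \simeq \frac{1}{\sqrt{L_2}} \bigl\{ \phi^{(0)}(z) + C_3 \phi^{(3)}(z) + [ C_4 \phi^{(4)}(z) + C_6 \phi^{(6)}(z) ]
+ [ C_5 \phi^{(5)}(z) + C_7 \phi^{(7)}(z) + C_9 \phi^{(9)}(z) ] + \cdots \bigr\}, \qquad (189)$$

where now [16]

$$\phi^{(k)}(z) = \frac{d^k}{dz^k} \frac{e^{-z^2/2}}{\sqrt{2\pi}} \quad \text{and} \quad z = \frac{x - L_1}{\sqrt{L_2}}, \quad \text{with}$$

$$C_3 = -\frac{L_3}{3!\, L_2^{3/2}}; \quad C_4 = \frac{L_4}{4!\, L_2^{2}}; \quad C_5 = -\frac{L_5}{5!\, L_2^{5/2}}; \quad
C_6 = \frac{L_3^2}{72\, L_2^{3}}; \quad C_7 = -\frac{L_3 L_4}{144\, L_2^{7/2}}; \quad C_9 = -\frac{L_3^3}{6^4 L_2^{9/2}},$$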

and (186a), (186b) apply respectively here for our par- 
ticular characteristic functions (185a), (185b). Again, for 
continuous sampling, the appropriate iterated kernels 
take the place of trace G”, etc. Note that in this form, and 
under these conditions, it is not necessary to calculate 
the eigenvalues of G, or G’. For W,(y), of course, exact 
results can be found, (184), but for W,(x), (185a), this 
is not possible, and our only approach is as given above, 
for threshold performance, and is not valid when the 
weak signal condition is removed. 


41 Bibliography [7], (17.7).


APPENDIX III


TRACE CALCULATIONS AND ITERATED KERNELS IN
THE WEAK-SIGNAL THEORY


A. Some General Relations


To obtain the bias and the semi-invariants with the
coefficients $C_k$, cf. (185a)-(189), required in the threshold
theory, we must consider the dependence of these quantities
on the input signal-to-noise (power) ratio $a_0^2$ and
develop the results in suitable ascending series in this
parameter. We find for the situation of discrete sampling
(with background noise of finite total intensity) that the
matrix $G$ takes the following forms:

$$(G)_{\text{bias}} = a_0^2 D; \qquad G_N = I - (I + a_0^2 D)^{-1}; \qquad G_{S+N} = a_0^2 D; \qquad D = k_S k_N^{-1}. \qquad (190)$$

The quantity of interest here is trace $G^m$ $(m \ge 1)$. Let
us accordingly consider the case of noise alone first (as
indicated by the subscript $N$), expanding the matrix $G_N$.
Taking the trace gives


trace Gr = >) (—1)2C, 
l=0 


= (-1)'q +h— D! 
= qui — 1)! 


[a> trace Do 5 


a;" trace D’, (191) 


q=1 


where ,,C, is the binomial coefficient m!/(m — 1)!1!. This 
can be written more compactly as 


foo) 


trace Gy = >> (—1)"b,,, 03° trace D%, 


a=1 


(192a) 


with 
One = » OF (a+1-1C q) (—1)' 
1=0 


= NA > BCR OA Bes (Opies iyi <a 1D) 
==) quits — 1). (192b) 


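In terms of the eigenvalues $\lambda_j^{(D)}$ of $D$, trace $D^q$ is alternatively
$\sum_{j=1}^{n} (\lambda_j^{(D)})^q$; see (148). Similarly, for signal and
noise, we have

$$\text{trace}\, G_{S+N}^m = a_0^{2m}\, \text{trace}\, D^m = a_0^{2m} \sum_{j=1}^{n} \bigl( \lambda_j^{(D)} \bigr)^m. \qquad (193)$$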


When continuous sampling is employed in the general 
case (D = kgky’), (192a) is not a convenient form, but 
for the problem of a white noise background (D = k;) 
it turns out that these various expressions for the trace 
operations go over into comparatively simple expressions 
involving the iterated kernels of the signal process. To 
see this, we recall that the mean intensity of the back- 
ground noise is now limz..WoyB = lim,.. (n/2T) Won, 
where Woy is the spectral density of the interference, so 


that for this white noise 
ee re 
‘(2H ss trace ks , 


(At = T/n), 


$ Y ™m ms n 
= VS lim (¢ At SS (ks) tit» eis (s)an} 
ON licstlm 


noo 


lim {a,” trace D”} = lim 


no n> co 


=? G tT Be ah (194) 


with of = WsT/Woy an input signal-to-noise (intensity 
ratio), and (B‘”)., the iterated kernel for the signal, e.g., 


$$\bigl( B_m^{(T)} \bigr)_S = \int_0^T \!\cdots\! \int_0^T k_S(t_1, t_2)\, k_S(t_2, t_3) \cdots k_S(t_m, t_1)\, dt_1 \cdots dt_m. \qquad (195)$$

If the signal process is stationary, the symmetrical kernels
$k_S(t_1, t_2)$ can be written $k_S(|t_1 - t_2|)$; note that $(B_m^{(T)})_S\, T^{-m}$
is dimensionless.

B. The Bias


With Appendix III-A in mind we can write the bias
in a number of alternative ways, of varying utility for
computation. Returning for the moment to the general
case $D = k_S k_N^{-1}$, we have

$$\Gamma_0 = \log \mu - \tfrac{1}{2} \log \det (I + G_{\text{bias}})
= \log \mu - \tfrac{1}{2} \log \det (I + a_0^2 D)
= \log \mu + \log \prod_{j=1}^{n} \bigl( 1 + a_0^2 \lambda_j^{(D)} \bigr)^{-1/2}. \qquad (196)$$

In terms of the identity (155) this becomes

$$\Gamma_0 = \log \mu + \frac{1}{2} \sum_{m=1}^{\infty} \frac{(-1)^m}{m}\, a_0^{2m}\, \text{trace}\, D^m, \qquad (197)$$

and the series is convergent, provided $a_0^2 |\lambda_1^{(D)}| < 1$,
where $\lambda_1^{(D)}$ is the largest eigenvalue of $D$, cf. Appendix
II-A.

For continuous sampling in (0, 7’) we have from (161a) 
(161b) 


Ty = log u — 4 log Dr(as) 


ao" [Bn Ip, (198 
where (B‘”], is the iterated kernel for D = kgky’ 
namely 


[Benils (198a 


TF. 
= [---f Dit, DG, &) +++ Dll fh) dt +++ aby. 
0 


The second relation in (198) is equivalent to the logarithn 
of the Fredholm determinant (which is absolutely con 
vergent for all a2 > 0), as long as az |[\Jp| <= ae 
which (\‘”)p is now the largest eigenvalue of (152a 
with G(t, 7) replaced by D(t, 7). In the case of a whit 
noise background, similar remarks apply to the appro 
priately modified version of (198), namely 


= = ill a m —m 
y= log at Doe ener ae 
m=1 


(194), (195). 


(199 


C. Narrow-Band Signals 


All of the above carries over directly for the alternativ 
treatment (see Section IT) of narrow-band signals. Instea 
of (190) we have 


(G)uies — (cD Gye ea De 
Ghow = 5D!) Die ki) eee 


where kf, ky are given in Section VII. The semi-invariant 
are modified according to (186b) while the bias is now 


Ty, = log » — log det (I + a;D”) 


= logu + log [J {1 4+ af0?°}7, ~— (201 
while (198), (199) become alternatively 
ro) a | m se ; 
Ty = loom 5 fhe Ge (Bae lpeemerat (201e 
m=1. 
T's [erie = loge DU (D2 Be Is Om 
m=1

this last for white noise.⁴² The eigenvalues and integral
equations for them have the same form as for the general
approach, but, of course, may be quite different in detail.


D. RC-Noise Signal (White Noise Background and Continuous
Sampling)


Here the signal process is obtained by passing originally
white noise of spectral density $W_{0S}$ through the RC
filter of Fig. 8, and taking the output across the condenser.
The spectrum and covariance functions are here

$$W_S(f) = \frac{W_{0S}}{1 + \omega^2 / \omega_F^2}; \qquad K_S(t) = \psi_S\, e^{-\omega_F |t|}; \qquad
\psi_S = W_{0S}\, \omega_F / 4; \qquad \omega_F = (RC)^{-1}; \qquad \omega = 2\pi f. \qquad (202)$$

Fig. 8: An RC-noise signal process $V_S(t)$.

For the normalized kernel $k_S(t) = e^{-\omega_F |t|}$ it can be
shown after considerable, though direct, calculation that
the iterated kernels $(B_m^{(T)})_S$ here become specifically for
the first five orders

: WA+e™*—1 
De a 
Pi BO Dit Oct De *\, 
3, RC D) ne 5) 
267 Me —2h 5 —2h ek 290 

—2 —2) =2X —A) 
T) eee G 3e 206 te 7 
Ro = OL ‘e ar D3 i iG 

12 —2) =e 13 
+ Hes , (208) 


nere \ = wp’. If d is large (the case of usual teal in 
reshold theory), we may approximate the B{7?,¢ above 
r all orders, omitting the oncauals and retaining 
ly the smallest powers of \~” in each case. We find then 
fips) Ei BW is given approximately as jthe 


efficient of \~”**, n>1, in the expansion of (1 — 2/d)~"””, 
Bye 1 
Boe t bo = Ee > nn ™ + OA”), 
(m = 1), (204) 


., while \ is normally 0(5) or so, for reasonable approxi- 
ations (and is exact for m = 1). 

42 We have now used $B = n/T$ for each component ($N_c$, $N_s$) of
the sampled noise and signal waves, in place of $B = n/2T$ above
for the broad-band representation, see (194) et seq.; the total number
of sampled points is still $2BT$, since there are two components in
the former case.

On the other hand, when $\lambda$ is small compared to unity,
corresponding to short observation times, we can derive
some useful results for $(B_m^{(T)})_S$, by expanding the kernels,
i.e., $k_S(t_1 - t_2) = 1 - \omega_F |t_1 - t_2| + (\omega_F^2 / 2)(t_1 - t_2)^2 + \cdots$.
After some manipulation, we obtain


de 


Bes = ie — wpmr | | t; = 


D, 
2 
+2 mt | kids = 
0 
T 
+ amr fof lt = 
0 


: | bi+1 a bie | dt; dt; +1 dt; +2 


+ mim — ayrn* fof 1g —t 


bay Pt; dia, 


Gr i dt; Gla 


jar | 


j+1 | 


-| poled | dt; dt;+1 dt, ana} oe an) (204) 


and carrying out the integration gives finally 


1 — mA 


—mp(T) 
T "Briere = 3 


: [ae ie eau. | + OM), (m2 2). (205) 


60 


Inserting (204) and (205) into the expression (199) for 
the bias, we may sum the series to obtain 


To,2¢/= log p= Af + — 2 —1 4 O(\”, 00) (206a) 


log (1 <= 263) Mae 203 ) 
oy T 3 11595, 


205 
Sy ie 25 2242) ome. 
eae + ogt\ T+ do? Jf + OM, %0), (206) 


respectively. In a similar way we can use these approxi- 
mations to determine the semi-invariants in the case of 
white noise, with the aid of (194) applied to (192a), 
(193). Using (204), and summing the various series in m, 
we get 


De =D) paneaice) aay) (Aon eee ei 
(m = 2) | 


To, rc] = log PS 


(207a) 


Dino =) 2)eeae: Ne, (207b) 

for noise alone and for signal and noise, where d is large, 

1.e., the observation period is comparatively long. For 
= |, the above still applies if we add Cp; see (186a). 


E. Narrow-Band, LRC-Noise Signal (White Noise Background
and Continuous Sampling)

Fig. 9: An LRC-noise signal process $V_S(t)$.

Again, white noise of spectral intensity $W_{0S}$ is passed
through a linear filter, to generate the signal process. The
filter in question is now of the LRC type, shown in Fig.
9, and the associated spectrum and covariance function
of the signal ensemble are respectively

Woswo 
dopo [1 + (@* — wo)’ /4o°wr] 


Ws(f) = (208a) 


K,(t) = Wse°?""! (cost + =~ sin Or tye ORb) 


where 
Ae WS Ces on = LC: Gp = h/ 2h: 


(RCs 2en/o,); We Waswa son (208c) 


[Note that wr ec, (202), and wr|zrc, (208c), are different 
quantities]. In the high-Q cases we are interested in here, 
(208) becomes 


Ws(f) = Dy si'| 1 + e=)']"; 


Wp 


K s(t) = ¥se °"'** cos wot, (209) 


since Q(= w)/2wr) is now taken to be very large. If we 
insert this expression for ky = Kgsy;' into the various 
iterated kernels (195), etc., we find that only terms in 
the integrand that do not contain cos wf make a significant 
contribution. In fact, we easily establish that, 7 form, 


Be eee 2o aD ane (210) 
where we observe, of course, that \Xxrec = (wr)rcl’ and 
AX = (wr)rercl’ are usually different magnitudes, repre- 


sented in both instances by \, while it is normally clear 
from the context which is meant. Note here that ké(| ¢|)o = 
e °F"! is formally identical with ks(¢) of the RC-noise 
signal discussed in Appendix III-D. 

Modifying (204) accordingly, and repeating the steps 
outlined above for the RC-noise signal, we get in straight- 
forward fashion 


Tiree low NCW Le Aa el eRe) 
(21 1a) 


2 ] f i} ON = 
To,trc = log [io at oe a 20) ete (. age 
0 


do 6 
” a ie 4 ay oes 
720 (; + 65 1+ 0 He CS GON eu) 
respectively for large and small \(= Az ec here). Similarly, 
from (186a), (186b) the semi-invariants are simply, 
Li ire zs 2L,..RC} Lie tee a 20. ke. 2) 


while for m = 1 we have, instead, Li}rg = Ch + 2 
Ce ao DR = Ce 2 TE). Again provided the 
signal is narrow-band, (207) can be used in (212) to give 


(212)

the desired semi-invariants directly for the LRC case.
Since $\sigma_0^2$ is proportional to $T$, it is often convenient to
use another signal-to-noise ratio, called the effective
signal-to-noise (intensity) ratio, defined by


o, = Ws/Wonwr; 


Pe es 2, 
09 = No.3 


lop = (wr) Re, or (wr)recl- (213) 


Then, for the specific signals of Appendix III-D and III-E
we can compute the semi-invariants and related $C$'s of
the Edgeworth expansions (189), in the cases of long
integration times $(\lambda > 5)$.


APPENDIX IV

SOLUTIONS OF THE INTEGRAL EQUATION FOR CONTINUOUS
SAMPLING, WITH RATIONAL SPECTRA, IN A WHITE
NOISE BACKGROUND


A. A General Solution of the Integral Equation 


The integral equation we have to consider here takes 
the general form (for stationary processes) 


[ Kee 


= B iP K s(t — u)V(u) du, Orr ear (214) 


where A and B are real, A > 0, | B| > O. {For our 
particular problem A = Woy/2, B = (Woy/2)’, so that 
A has the dimensions of [amplitude’/frequency], and since 
K, is [amp’], V is [amp], we see that 27 has the dimensions 
[amp ’ freq]. In what follows, however, we shall let A 
and B be constants, not necessarily related by B = A™’, 
but with the appropriate dimensions, as determined by 
(214).} It is now postulated that: 


Deets Vi) = 0 outside (O—, T+); 
2) Kgs(t — u) = Keg(| t — u |) 


N 
= em es Re (b,) > 0. 


This is a form of “lumped-constant” noise, yielding ¢ 
rational spectrum; the covariance function Ks, is real. 
so that the a,’s, 6,’s occur in complex conjugate pairs 
(N=2F2) 

Thus, we may write (214), using the continuity property 
of Ks, as 


[ Ks — Were) — BVGH] du + Ae 


N 


SS AS? eee a is T 
1 
= 40, 0O<t=<T (215) 
N 
SA ee i<0 
S21 


where the A‘* (with the dimensions [ampl’]) are to b 
found from the requirement that z(¢) = 0 outside (0, T). 
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Let us now write 


Te 
(Dp), = i € ey(t) at 

0 

Spy = / e'V(t) dt, (216) 

id 
s() = AE VGe)",  @ = anf; 
< Wos A 7 
s(p/2mt) = io, (Dp) ap) 

= if o-"'K(t) dt, (217a) 


this last by the Wiener-Khintchine theorem, where also
the inverse relation is

$$K_S(t) = \int_{-i\infty}^{i\infty} e^{pt}\, W_S(p/2\pi i)\, \frac{dp}{4\pi i}. \qquad (217b)$$

Carrying out the indicated operations for the specific
kernel $K_S(t)$, above (215), gives

$$W_S(p/2\pi i) = 4 \sum_{n=1}^{N} \frac{a_n b_n}{b_n^2 - p^2}. \qquad (218)$$
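The rational form (218) can be checked numerically on the imaginary axis $p = i\omega$, where it reads $4 \sum_n a_n b_n / (b_n^2 + \omega^2)$. The sketch below compares this with a direct quadrature of twice the Fourier transform of $K_S(t) = \sum_n a_n e^{-b_n |t|}$, the factor of two being the spectrum convention taken here from the Wiener-Khintchine relation (217a). Real, positive $a_n$, $b_n$ are assumed for simplicity (the text allows complex conjugate pairs), and the truncation point, step count, and function names are illustrative choices, not part of the original analysis.

```python
import math

def spectrum_closed_form(omega, a, b):
    """4 * sum a_n b_n / (b_n^2 + omega^2): the form (218) evaluated at p = i*omega."""
    return 4.0 * sum(an * bn / (bn * bn + omega * omega) for an, bn in zip(a, b))

def covariance(t, a, b):
    """K_S(t) = sum a_n exp(-b_n |t|), with real b_n > 0 assumed."""
    return sum(an * math.exp(-bn * abs(t)) for an, bn in zip(a, b))

def spectrum_numeric(omega, a, b, t_max=40.0, steps=40000):
    """2 * integral of K_S(t) exp(-i omega t) over all t, i.e. 4 * int_0^inf K_S cos."""
    h = t_max / steps
    total = 0.5 * (covariance(0.0, a, b)
                   + covariance(t_max, a, b) * math.cos(omega * t_max))
    for i in range(1, steps):
        t = i * h
        total += covariance(t, a, b) * math.cos(omega * t)
    return 4.0 * h * total
```

For example, with `a = [1.0, 0.5]`, `b = [1.0, 3.0]` and `omega = 2.0` the two evaluations should agree to within the quadrature error.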


iking the Fourier transform of both sides of (215), and 
serving that (for p = 277f) 


Ket — ue? dt=e™ / eK a) dz 


= 9 Ws(p/2mi) (219) 


» obtain in a straightforward way the transformed 
rsion of (215), namely, 
Asa 
B= ;) 


+ BS r(p) yW s(p/2n1) \ (220) 


Ane” 
b, +p. 


(p). = [2A + Ws(p/2ri) | 42 DY ( 


We now observe that there are 2N roots of 24 + 
s(pa.) = 0,k = 1,2 , N, so that writing 


H(p) = (2A + W(p/2mi)]"* (221) 
» have 
| 9 s(p/ 2nd) 5” 
=k) nexp {—p,''} = h(t). (222) 


Here the $p_n$, $[n = 1, \ldots, N]$, are the $N$ roots of $H(p)^{-1}$ with
positive real parts. That $h$ is a function of $|t|$ follows from
the fact that the real parts of the poles of $W_S$, and $H(p)$,
are symmetrically located on either side of the imaginary
axis. Further, we have assumed here that none of the
roots of $H(p)^{-1}$ is multiple, in accordance with 2) above.
In fact, we may write


$$W_S(p/2\pi i) = X_N(p)/\Phi_N(p) \qquad (223a)$$

$$\therefore\ H(p) = \Phi_N(p)/\{2A\Phi_N(p) + X_N(p)\} = \Phi_N(p)/\Psi_N(p); \quad \Psi_N(p) = 2A\Phi_N(p) + X_N(p) = \Psi_N(-p). \qquad (223b)$$
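In terms of (218) we have specifically

$$W_S(p/2\pi i) = 4 \sum_{n=1}^{N} a_n b_n \prod_{j=1}^{N}{}' (b_j^2 - p^2) \Big/ \prod_{j=1}^{N} (b_j^2 - p^2), \qquad (224a)$$

so that

$$X_N(p) = 4 \sum_{n=1}^{N} a_n b_n \prod_{j=1}^{N}{}' (b_j^2 - p^2); \qquad \Phi_N(p) = \prod_{n=1}^{N} (b_n^2 - p^2), \qquad (224b)$$

where the prime on a product indicates that the factor $j = n$ is omitted.
Writing $c_n$ for $p_n$, (222) et seq., let us define (with attention
to dimensionality)

$$\Psi_N(p) = 2A \prod_{n=1}^{N} (c_n^2 - p^2) = 2A \prod_{n=1}^{N} (b_n^2 - p^2) + 4 \sum_{n=1}^{N} a_n b_n \prod_{j=1}^{N}{}' (b_j^2 - p^2), \qquad (225)$$

with $c_n \ne b_n$ and $\operatorname{Re}(c_n) > 0$. From (221), (222), (223a),
(223b), we get, with straightforward integration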


oq ; ny d 
1 eh AN) toe 


N’ 2 2 
Andy I] (2 She t| : (226) 


from which it follows that 


ne B TES. 
At this point we now define 
G(p) = W s(p/2ni) H(p) 
=f Wether dt, Rel) =0, 228) 
[and subsequently continue G(p) analytically]. Taking 


the transform of the last member of (220), we obtain 
Cory ; a dp 
|. GOS: 0" 5. 
=f Vole + eal ¢ ae 


Hi 
= Ht RG taal VG) dls (229) 
0 
Now, since the zeros of $H(p)^{-1}$ are the same as the zeros
of $\Psi_N(p)$, (223a), (223b), and since the poles of the
expressions in the parentheses in (220) are the same as the
corresponding zeros of $\Phi_N(p) = \prod_{n=1}^{N} (b_n^2 - p^2)$, when
we take the Fourier transform of both sides of (220), we
observe that these latter cancel one another, and the only
poles of the integrand occur at the zeros of $\Psi_N(p)$, (223b).

Thus, with the help of (229), using (222), (223), (224b),
(225) we get


= ioe. pt 2 2 2\— 
= 247 [Bet Tot — pe - 2 


N Agee Ane )} 


N 7: 
+B) hy f Viera edi: 
n=1 0 


zr(0) (230) 


where for noise shaped by physically realizable networks— 
here linear, lumped-constant filters—we require that 
Re(b,), Re(c,) > 0. Notationally, we have also z7 = Vr = 
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I] (b; + ¢;) ee (b; — ¢;) 


(N) lee 
d, = 


I] (Cy -ic,) I] Cs) 
I] @ +e) IL @ -« 


i) = 42 i= -  (282b) 


I] (¢; + Cn) I] (¢; — Gy) 


Now since (231a), (231c) are each identities, the coefficients
of $e^{\pm c_n t}$ must vanish for each $n$ $(= 1, \cdots, N)$,
giving us $2N$ equations from which to determine the $2N$
unknown constants $A_n^{(1)}$, $A_n^{(2)}$, $(n = 1, \cdots, N)$. Letting


(232a) 


—cnt Cue 
’ 


z 4 
= [ vee ae; =Bh, [Vee ax 
e 0 


0, @ > T, 2% < 0), while z-@) = 2@); VzG Vide) een Bs 
0 : ; 2 T). Ns 7(t) (2) r(t) (t) b peed ie Ae bite ene Ae (233) 
We are now in a position to evaluate (230). Observing C20 AS C= od? Am 
fort > T that the poles of the integrand are simple and Bs a , 
occur at p = —c,, and that for0 < t < T, (allt < T), the with X, = Ai Vn = Aa We rae REO 2 
poles are at p = c,, while for ¢ < 0, they also occur at from (231a), (231¢) the set of equations 
= ¢,, we find that (230) becomes explicitl 4 b AY, = 
Pp (230) plicitly Gai beer ne es , Great Se ny ales 
N r a+ OLX, + OY, = 0 
GST): 0=B > (ef V(a)ec’, dx)e°™' 
n=1 0 for which the solutions are 
ale 9A} » (AS? dee hype xX ee A? ore OC. == OG. : 
Cn eae n a ) 
Pan 7 by 6b 200: 
“LED Ar i. Abe edincarne (31a), Y,= A. = 2 Ga ee 
n=1 n 
N T provided A, = b/é, — b,¢ # 0, where specificall 
(rt <1) 2 Bee hy f Viaje °°" *! da ere ie SS ‘i 
a eae n=1 0 = b; — a 
a, = 4a TT PED Te, +04 
a 9A- 1 VS ce —cent t i 
3 3 {(b, a Gy e€ ee (b,, ae G) on y (236) 
. : oe 
Lae Ip p~ent with b; ¥ c¢;, b;, c; distinct, all n. From (232a), (232b), 
1 24 > An en @ (23%) (233), (235), and (236) we get directly for the 2N con- 
stantssA ce. vA sa: 
( T T 
J (b, + ¢,) / Viaje’* dx — (b, — ¢,) i Viger@ al 
A,” = F,- ; : 237 
L (b, — ¢,)°e °"” — (b, + ¢,)'e"* 2G 
f T ip 
e°"" (by, + C,) i Viaje °"? dx — e °""(b, — ¢,) ‘| V(ax)eo"* ax | 
Axe? = Bee 2 238 
(b, = eye" = (b, + alo” - 
where 
N Tv 2 —_ 2 b; — Cc 
(G0) 0'= 8B eon [ Viaje"? dx F,, = ABe,h, I @ = %) = = 280.0, I (2=8 “), (239) 
4 9453 x (AM e Menem This, in conjunction with (231b), completes the genera 
solution of (214) and (215), subject to the restriction 
ee that K s(t) possesses a rational spectrum, 7.e., Ks is 
aaa E, Aus ie tO (231c) the covariance function of “lumped-constant”’ noise 
r When the spectral representation (217a) has multiple 
where d‘” and e“” are given by poles, an exactly analogous procedure may be employed 


7 


or the results may be obtained from (237)-(239) by a
suitable passage to the limit, e.g., for double poles:


= a a 
bebi (b; ail ) : 


lim 


B. Special Forms of the Solution


Some special cases of (214) are readily obtained when
$A$ is set equal to zero, or when (semi-) infinite observation
periods ($T \to \infty$) are allowed. When $A = 0$, we rewrite
w(p) as B* [[™_, (2 — p’), (225), where we are now 
concerned with the roots, $c_n$, of $\sum_{n=1}^{N} a_n b_n \prod_{j=1}^{N}{}' (b_j^2 - p^2) = 0$
(the prime omitting the factor $j = n$). These then lead us to the
corresponding version of (230), from which $z_T(t)$ is determined
as above. A variant of the case ($A = 0$) is

$$\int_{0-}^{T+} K_S(t - u)\,z_T(u)\,du = g(t), \qquad (0- < t < T+) \qquad (240)$$

where now $g(t)$ is some given function of $t$ in $(0-, T+)$,
differentiable to a suitable order. Solutions of (240) in
the case of the rational spectra assumed here are available
elsewhere [17], [18].

In the limiting situation of semi-infinite observation
periods, we find from (237), (238) that


m 7, lee = Ee Xb,, =r Clas 


vis 
ate ent | Vine ax} Fey = (O41) 
0 


Too 


fa+ viea [vee ae — a -— VIED [Veer ae 
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where the roots of 2A +'W s5(p/2z7), (221), are found from 


ne dts tt) 0 


ven Cyt orp V1 al Yo» (Recie20)- 
Yo = 4Ys/worWon = Wos/Won3 


(ig — 70 


if (245a) 
hk =a V1i+% 
ON 
= (Wes), V1 +75 = yorV1 + 7% 

Won 
From this, fF; = wpy,, and 
Woe, Del Geel pany eine 
; 2c, BVA Aa oa ie 

ei? ics b, a: Cy a 1 a= V il EVO. (245b) 


2c, 2Vi1 + ¥ 


Returning to (231b), remembering that $B = 2/W_{0N}$,
we have here finally for the desired solution


Qyow V 1 SE Vo he Viaje S| dz 
Won 0 


27(t) = 


An = owa| 


AD = 


wre 


where $V(t)$ is assumed to be suitably bounded, for all
$t > 0$. We have also


WA.) = FF 3(b, + C,) 


“0 Vigice 00: (<a), (242) 


so that our solution (231b) becomes the Wiener-Hopf
relation


x a b, —c 
—cn|t—z| n n 
i) = B dX he i Viaje da (i a e) 


fp te il Vaer™ ars, 40 (243) 
0 


0 Se 2 


C. RC-Noise Signal

For an RC-noise signal [Fig. 8 and (202)] we find that 

bs a, = a; B= Chi = 4V ser) W ons 
hy = 2a,b,/ Ac =s AW san / W ont, (244) 


by = Wry 


ees ee 
+ 7 ( a a + ) (246) 
AA eA el Ona 
where now specifically 
(247a) 
CON EL NS Ge 
f fi phan : Jabot 
Ja+ Vitae? [ v@ee as -a- VIF%) [ V@re? ae 
: oS 2 (247b) 
(lL— V1l+ ye" — (1+ V1 +)" | 


For semi-infinite observation periods, (246) reduces to the 
Wiener-Hopf result 


zal) = BL TER f V@e de 


oe V1i+% 


Sa he “aah > 0. (248) 
ele ay =) i som bad ; 


D. LRC-Noise Signal 


Here the relations of (208) apply, so that we can write 


Wall) = Gr 
+0 pa) 
by = Wp 810; bp = 104 to =, ee tore (249) 
by een by = 2 es iby reas 
bo + B2 = 202 — 3 
bi — bs = —4iwwr 


and 
K s(t) = ae"! + ae"; 
a Wale Wis 
1 Sb 8 a 


The roots of H(p)~*, (221), are found from (225) with 
the aid of the above as solution of 


p — 2wr — wip’ + w(1 + yo) = 0. 
We get finally 


= = 2 2\1/4 * 4 =] :) = T 
Die = Cis (a +b)” exp = E u Al 


Re (,¢2 > 0), (252) 
w; — wp (> 0 here); 


ae ee 


with (a? + b°)”* = w(l + yo). 


(251) 


where a 


j= 


w/2u7, 


For the high-Q cases, i.e., narrow-band signals, we have


Wo(l + v0)" 
“exp ie (ian as Nisan lees Neel VE ale 
(exact) 


Ces 0-) To 

(253a) 
= wo(1 + 73)'”* exp ee (tan A/ We *)\ (253b) 
Applying the above to (227), (239), we get 


QwoWs GC ms “1, 


CC, = 


P, 


- 5 UW ono b; 
20 s (2 = “i 
of = > ¢ 
tbe OW one1Cy C5 mes? 5 4 a) 


and since c¥ = c, we get finally 


= by — ¢; 8wows Zi —eilt-2l 7, 
Zn he {(e a “) | f V(a)e dx 


Wenwreyt 
+ (Bb, — ed) {( erie 8 P+ eat (255) 


where $(\ )^{(1)}$, $(\ )^{(2)}$ are respectively the quantities that
are coefficients of $F_n$ in (237), (238), when $n = 1$. For the
semi-infinite observation periods we get the Wiener-Hopf
solution ($t > 0$)
Ais Gh ‘i Wi —¢e1|t—2| 
= “) | f Viaje dx 


PARE (¥ 
+ (Ree) f° vee ae} 


E. A Related Integral Equation 


When the background noise is white, we can obtain an 
alternative representation of the detector structure which 
does not involve the received data V(t) explicitly in the 
resulting integral equation. In matrix form the structure 


2 
SwoWs 
2 5 
Wonoiei? 


(256) 
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term is (Section I-D) where for the moment we regar 
yy as finite (except when passing to the hmit n > © o 
—- © ) 


&, = vy VGyyV, with C = Ay — (Ay + Ks)” 


i. | = [Wy 55x]. 


If we write $Z = \psi_N C V$, then $\Phi_n = \psi_N^{-1} \tilde{V} Z$ (48) and passage
to the limit ($n \to \infty$) for the continuous cases now gives


(257 


: i 
lim &, = by; = re lim >» = V(t, Z(t) 
2 da 
Se V(OZ(t) dt. 258 
wf VOR dt. ( 
At this point let us write 
wwC = 1 — (1+ agks)* = [pr(t:, t,) Ad], (259 


where $\rho_T(t_k, t_j) = \rho_T(t_j, t_k)$, since $C$ is a symmetric
matrix. Note, moreover, that we can also write


pr(tk, t;) ra pr(t; =U5 t,) a prt; ese t;), 


which is a form, as we noted in Section V-A, that has a
particularly useful physical interpretation. Consequently,
the relation $\psi_N C V = Z$ becomes


(260 


Z,= by >, Ca Vi = D_ Ato, —hytjVn Come 
k=1 k=1 
and in the limit $n \to \infty$, $\Delta t \to dt$, we get formally
ob 
Z(t) = / Vihori — U,) di ~ © S10) noone 
0 


where we have assumed that $\rho_T$ and $V$ possess suitable
properties of continuity in the region $(0 < t, u < T)$.
Insertion of (261b) into (259) then yields


Op = 2 I Vit) ert, ty) V(t.) dt, dt, 
Won 
0 


Ges GH 
ee / Vit) dt, / VGdort = be) de 
Won 0 0 
for the structure factor when continuous sampling is used.
Our next step is to find the integral equations which
govern $\rho_T$, and $Z(t)$ in (261). Starting with $\rho_T$ first, let us
multiply both sides of the relation for $C$ by $\psi_N(\Lambda_N + K_S)$
and observe that the $jk$th element of $\psi_N C(K_S + \Lambda_N) =
\psi_N \Lambda_N^{-1} K_S$ is


PS, oC) (Ks + 


Using (261a) and passing to the limit, we get directly 


Wow n 
200k 


3. = (Ks) x. (263 


e | Kt. u) +7 ity Oa 0 Jona u) du 


am K s(t, ae (0 < tT < 1) (264 


which is the desired integral equation for $\rho_T$, and is a
special case of a somewhat more general expression
obtained by Price in his treatment of the reception of
scatter-path noise.$^{43}$

With the help of (261b) it is now a simple matter to
derive the integral equation for $Z(t)$. To do this, we
multiply both members of (264) by $V(\tau)$ and integrate
over $\tau$, to get


: Ks(t, off o(r, u) Vir) dr) du 


: 
oe lr, 0VG) dr = 


Using (260) and (261b) this becomes at once


.T 


a 7 Vix)K g(t, 1) dr. (265) 


ee VATE Ho Z(t) 


= (hs Vinkst, dr, O©<t< DB. (266) 


For stationary processes we see now that (266) and
(214) are identical, provided


Zi) = 72 anld), 


OLS iaT)s (267) 
and so our solutions for $z_T(t)$ in Appendix IV-A, except
for a scale factor, are also the solutions for $Z(t)$, under
the present assumption of rational spectra and white
background noise.

The solution of the integral equation for $\rho_T$ may be obtained
by direct application of the method described in
detail above for $z_T(t)$. However, since $\rho_T$ and $z_T$ (or $Z$) are
linearly related by (261b), we can obtain the desired relations
directly by comparing (231b) and (261b). The
result in the present case is, for $(0 < t, t' < T)$,


N N 
Pg Sane eee 
n=1 n=1 


N 
+20 BO Were" — (268) 
n=1 
where


(b, ate Gee. aT (bn 70 Ceo 
Weep P - =, 
n (t’) ai FP; Ze Ds he (b, ae Open 


ee) == Pye a os) ae Coe aod is an Gy \ 
(b, — ¢n)'e°*” — (Dn + Gn)'e"" 


(269a) 


(269b) 


From this it is clear that we can write $\rho_T(t, t') = \rho_T(t - t', t) =
\rho_T(t' - t, t')$ by suitable adjustment of the exponential
terms. For the (semi-) infinite observation
times ($T \to \infty$), the corresponding Wiener-Hopf solution


( 
) 


(t a) a7 Si h Ta aa ea = & Te Ba, 
ao\") =p eo n b, ays 


= 1G St 1). Cee (270) 


43 See Bibliography [4], (54). 
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Eqs. (268), (270) may be specialized immediately to the 
RC and LRC spectra of Appendix IV-C and IV-D. 

Finally, we observe that the alternative treatment of
narrow-band signals in white noise (Section VII)
introduces no serious modifications of the analysis: one
simply replaces $V(t)$ by $V_c(t)$, $V_s(t)$ respectively, and
$K_S$ with $(K_S)_0$, in (214). For the high-Q LRC case,
$(K_S)_0$ is in form identical with $(K_S)_{RC}$, so that the results
of Appendix IV-C above may be used directly, with
$V_c$, $V_s$, to give the corresponding $z_T(t)_c$, $z_T(t)_s$. For $\rho_T$ in
the alternative treatment, the only modification is the
replacement of $K_S$ by $(K_S)_0$.


APPENDIX V 


A. Matched Filters 


As defined above in Appendix IV-E, the matched
filter is a linear time-invariant operator, preceding a
zero-memory nonlinear element, such that the average
cost of decision is minimized. Here we shall outline
briefly the argument by which such matched, linear,
predetection filters may be found, without, however,
attempting to give explicit solutions in the present study.

As before, we begin with a matrix formulation and let
$Q$ be an $(n \times n)$ matrix (with an inverse), i.e., $\det Q \ne 0$,
representing the matched-filter operation in the discrete
case. Then

$$V_F = QV \qquad (271)$$

is the filtered wave, when $V$ is the input data. The quadratic
form $\Phi_n$ (257), characteristic of the present class of
noise-in-noise problems, can now be written

$$\Phi_n = \psi_N^{-1} \tilde{V} C \psi_N V = \psi_N^{-1} (\tilde{V}\tilde{Q}) [\tilde{Q}^{-1} C \psi_N Q^{-1}] Q V = \psi_N^{-1} \tilde{V}_F A V_F, \qquad (272)$$

where $A = \tilde{Q}^{-1} \psi_N C Q^{-1}$ is dimensionless, and $C$ is assumed
to be symmetric. We remark that, for the moment, $C$
need not be determined by the condition for optimum
reception, but can represent any suboptimum receiver
structure of the above quadratic type.

Our constraint of a zero-memory, nonlinear operation
following the matched filter (in fact, part of the definition
of matched filtering) means that the receiver structure
becomes

$$\Phi_n = \psi_N^{-1} \tilde{V}_F V_F, \qquad (273)$$

and this in turn requires that $A$ be the unit matrix $I$, i.e.,

$$\tilde{Q}^{-1} C \psi_N Q^{-1} = I; \qquad (274)$$

thus, $C$ is diagonalized by a congruent transformation.
Since $C$ is symmetric, this is certainly possible, although
the diagonalization is not necessarily unique.$^{44}$


44 See, for example, Bibliography [12], ch. (10). 
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Inasmuch as Q is a (discrete) linear, time-invariant 
filtering operation, it will have the form 


Qe [Ap Hija ye ==; 


ng = iL ener os (275) 


where $h$ is the weighting function of this filter. At this
point we distinguish between physically realizable and
nonrealizable operators: for the former the added condition
that $h(t)$ vanish, all $t < 0$, is imposed, while for the
latter, $h \ne 0$, $t < 0$, in general. Thus, if $Q$ is to represent
a physically realizable, linear filtering operation, we have


Qz = [h(t; os E)pAtl; hp = 0, 


ae lA Cg ae Oy 70) (276) 


The general relation determining $Q$ in either case is found
from (274), viz.,

$$\psi_N C = \tilde{Q} Q, \qquad (277)$$

which are a set of simultaneous, nonlinear equations for
the $Q_{ij}$.
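Relation (277) asks for a factor $Q$ with $\tilde{Q}Q = \psi_N C$. A minimal numerical sketch of one such factorization is given below, under assumptions that go beyond the text: the matrix is taken to be symmetric and positive definite, the factor is the (transposed) Cholesky factor, and neither the convolution form (275) nor the realizability condition (276) is enforced. The function names are hypothetical.

```python
import math

def cholesky_upper(C):
    """Return upper-triangular Q with Q^T Q = C (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Q is the transpose of the lower-triangular Cholesky factor L.
    return [[L[j][i] for j in range(n)] for i in range(n)]

def quadratic_form(C, v):
    """v^T C v, the unfiltered quadratic structure."""
    n = len(v)
    return sum(v[i] * C[i][j] * v[j] for i in range(n) for j in range(n))

def filtered_energy(Q, v):
    """|Q v|^2: linear filter Q followed by a zero-memory square-law device."""
    n = len(v)
    out = [sum(Q[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(x * x for x in out)

if __name__ == "__main__":
    C = [[2.0, 1.0], [1.0, 2.0]]   # stands in for psi_N * C
    v = [1.0, -3.0]                # stands in for the data vector V
    Q = cholesky_upper(C)
    print(quadratic_form(C, v), filtered_energy(Q, v))
```

The two printed numbers coincide, which is the discrete content of (272)-(274): once $\tilde{Q}Q$ reproduces the weighting matrix, the quadratic form equals the energy of the filtered data.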

In the discrete case, (274) gives $\psi_N C = \tilde{Q}_R Q_R$, and with
$\psi_N C = [\rho_T(t_i, t_k)_R\,\Delta t]$, cf. (260), this relation becomes
specifically

$$\rho_T(t_i, t_k)_R = \sum_{j} h(t_j - t_i)_R\, h(t_j - t_k)_R\,\Delta t, \qquad (i, k = 1, \cdots, n). \qquad (278)$$


The continuous operations are of chief interest. Passing
to the limit, we get from (278) the basic nonlinear integral
equation

$$\rho_T(t, \tau)_R = \int_{0}^{T} h(x - t)_R\, h(x - \tau)_R\, dx, \qquad (0 \le t, \tau \le T), \qquad (279)$$

corresponding to (278), where it is assumed that $(\rho_T)_R$ is
given. For the Bayes, or optimum system discussed in the
text, $C$ has the specific form (244), and $(\rho_T)_R$ is then the
solution of (265), so that $h_R \to (h_R)_{\rm opt}$ in (279). The
condition of physical realizability, of course, means that
$h(x - t)_R = 0$, $x - t < 0$, etc. [This is reflected in the
fact that we leave the receiver's structure $\Phi_T$ unaltered,
if we set $\rho_T = 0$ for $t - \tau < 0$, (see Section V-A).] Observe
from (273), with yy = (2/W,)T/n, that in the limit 
(n— ~), or T/n; = At — dt, we get 


D z 
be = 7 ih V,(t)2 dt, (280) 


with 
OH 

eG / Viiyht thd’, (<t< 7), (281) 
0 


from (271), (275), for continuous sampling. Inserting 
(281) into (280) and using (279), we have alternatively 


TT 
2 
o, = [] Vidor, eV) dt dt, (282) 
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and in the case of optimum systems, this is identical with 
(263), where pr is obtained from (265), and hz — (hr)ovt. 

For the $C$'s or $\rho_T$'s, that have their counterparts in a
physical process (implying suitable continuity properties
on $\rho$ and $h$), we expect that (279) possesses a meaningful
solution, since the diagonalization of the symmetric
matrix $C$ from which (279) follows in the limit is always
possible with a congruent transformation. The reverse
situation, where $h(t)$ is specified beforehand as one
condition of the problem, presents no difficulties: the
nonlinear device is still square-law with zero memory,
but the receiver structure is no longer in general optimum,
since $\rho_T$ does not then correspond to the proper weighting
$C = \Lambda_N^{-1} - (\Lambda_N + K_S)^{-1}$ (for white noise backgrounds)
of the optimum decision system in the discrete case, nor
is $\rho_T$ a solution of the corresponding (265) for continuous
operation in $(0, T)$.

The solution of our fundamental nonlinear (279)
can be achieved as follows: we observe first that $\rho_T(t, \tau)_R$
is required to vanish outside the square $(0 \le t, \tau \le T)$,
from (70), since only operations on data in $(0, T)$ can
influence the (binary) decision process; ($V(t)$ is assumed
to be available for all $-\infty < t < \infty$). Thus, we have

$$\rho_T(t, \tau)_R = \begin{cases} \text{solution of (70)}, & (0 \le t, \tau \le T) \\ 0, & t, \tau \text{ outside } (0 \le t, \tau \le T). \end{cases} \qquad (283)$$

Taking the double Fourier transform of both members of
(279) then yields directly


ue 
P,(is)= ff on(t, ngete-? di dr er (eV Gane ae 
10) 
from which it follows at once that 
: 1/2 
PAG | = ro fh prltat) pen) mee ar} . (285) 
10) 


That the double integral is a real quantity ($\rho_T$ real, of
course), may be alternatively demonstrated with the help
of Mercer's theorem,$^{45}$ if we note from (260) that $\rho_T$ is
symmetrical in its arguments. Furthermore, since $\rho_T(t, \tau)$
is assumed positive definite,$^{46}$ we can expand it in the
bilinear form (Mercer's theorem)

$$\rho_T(t, \tau)_R = \sum_{n=0}^{\infty} \lambda_n \phi_n(t) \phi_n(\tau) \qquad (286)$$

where the $\lambda_n$ are all consequently real and positive (some
may be zero). These $\lambda_n$ are, of course, the eigenvalues of
the integral equation

$$\int_{0}^{T} \rho_T(t, \tau)_R\, \phi_n(\tau)\, d\tau = \lambda_n \phi_n(t); \qquad (0 \le t \le T). \qquad (287)$$
Taking the double Fourier transform again, with (286) 
for pr(t, T)r, yields 


“© Bibliography [6], (16). From physical considerations (our 
system is an energy detector, essentially), it is evident that ®7, (73) 
can never be negative. 
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co fh a 
Gore i| Cena ae / $,(te'** dt 
n=0 0 (0) 
=e N, |ta,@o)nl =, 0, (288) 
with $a_n(i\omega) = \int_0^T \phi_n(t) \exp(-i\omega t)\,dt$, and this immediately
establishes the positive, real nature of $P_T(i\omega)$. Thus, in
place of the exponent in (285) we equally well use
$\cos \omega(t - \tau)$.
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The Relationship of Sequential Filter Theory 
to Information Theory and Its Application to the 
Detection of Signals in Noise by Bernoulli Trials*


HERMAN BLASBALG†


Summary: In this paper the problem of detecting signals in
noise by the method of sequential filtering is formulated. A slicing 
operator for converting a given random variable into a Bernoulli 
random variable is defined. A method for choosing an optimum 
slicing operator in a certain prescribed sense is given. It is also 
shown that the Bernoulli sequential test is defined by three parameters
$a(p_0, p_1, \alpha, \beta)$, $b(p_0, p_1, \alpha, \beta)$, and $c(p_0, p_1)$ rather than four,
as one would normally expect. The significance of these transformations
is briefly discussed. Finally, the theory of Bernoulli sequential
detection is applied to the detection of a sine-wave carrier in noise 
when the signal-to-noise ratio is less than one. The efficiency of 
this detector is calculated and compared with the results of others. 
Curves of the significant Bernoulli sequential detector characteris- 
tics are given for this problem. 


I. FORMULATION OF THE DETECTION PROBLEM


Let $E_0$ and $E_1$ be two mutually exclusive events
whose occurrence an observer wants to detect.
Hence, if the event $E_0$ occurs, then $E_1$ cannot
logically occur, and if $E_1$ occurs, then $E_0$ cannot occur.
It is known, however, that the probability is unity that
one of these will be present when the observation is
started. The observer knows a priori that the event $E_0$
produces one of a possible number of effects $f_0(t)$ which
is a member of the set $S_0$, and $E_1$ produces one of a
possible number of effects $f_1(t)$ which is a member of the
set $S_1$. We assume that the sets $S_0$ and $S_1$ are disjoint.
In the problem considered it is assumed that the information
functions $f_0(t)$ and $f_1(t)$ can take on only two values,
"zero" or "one," during different time intervals. This can
be realized by a slicing operation. These functions are
sampled so that the zeros and ones which result are statistically
independent. If $p$ is the probability of a "one" when
a member of $S_0$ or $S_1$ is observed, then $S_0$ contains all
$f_0(t)$ for which $p \le p_0$, and the set $S_1$ contains all $f_1(t)$
for which $p \ge p_1$ such that $p_1 > p_0$. If the results of an
observation indicate that $p \le p_0$, then the observed
information function belongs to $S_0$ and the hypothesis
$H_0$ is accepted that the event $E_0$ occurred. Similarly, if
the results indicate that $p \ge p_1$, then the observed information
function belongs to $S_1$ and the hypothesis $H_1$
is accepted (or $H_0$ is rejected) that $E_1$ occurred.


* Manuscript received by the PGIT, November 23, 1956. This 
paper was presented at IRE-WESCON, Los Angeles, Calif.; August 
21-24, 1956. Research reported here was supported by the U.S. Air 
Force through the Office of Sci. Res. Air Res. and Dev. Command. 
This paper represents a portion of the Doctor of Engineering 
thesis presented to The Johns Hopkins University, Baltimore, Md.; 
May, 1956. 

† Electronic Communications, Inc., Baltimore, Md. Formerly
with The Johns Hopkins Univ. Rad. Lab., Baltimore, Md.


Since this is a two valued decision problem, there are
two types of errors possible. For example, it is possible
to accept $H_1$ when, in fact, $H_0$ is true or to accept $H_0$
when, in fact, $H_1$ is true. Let $\alpha$ be the probability of
accepting $H_1$ (signal-plus-noise is present) when $H_0$
(only noise is present) is true and $1 - \alpha$ is the probability
of accepting $H_0$ when $H_0$ is true. Also $\beta$ is the probability
of accepting $H_0$ (only noise is present) when, in fact, $H_1$
(signal-plus-noise is present) is true and $1 - \beta$ is the
probability of accepting $H_1$ when $H_1$ is true.

In sequential detection, the problem considered here,
the observer specifies the error probabilities $\alpha$, $\beta$ along
with the threshold parameters $p_0$ and $p_1$. A criterion must
now be established for choosing an optimum procedure.


II. BRIEF INTRODUCTION TO THE THEORY OF
SEQUENTIAL DETECTION


In this section the highlights of sequential detection 
theory will be outlined. The equations presented here will 
then be specialized to Bernoulli detection. For the math- 
ematical details the reader is referred to the literature 
(Lett 

A sequential test can be described as follows: at $t = t_1$
the observer measures the value $x_1$. Based on this value
it must be decided whether to accept the hypothesis $H_0$
that the parameter of the distribution $P(x, \theta)$ is $\theta \le \theta_0$,
or $H_1$ that $\theta \ge \theta_1$, or whether the datum is insufficient
to accept either one of the hypotheses with confidence.
(For the Bernoulli case we will use the symbol $p$ instead
of $\theta$.) If $H_0$ or $H_1$ is accepted, experimentation is terminated.
On the other hand, if the single datum is insufficient
to lead to the acceptance of one of these hypotheses, then
at $t = t_2$ the observer takes another sample $x_2$. Based
on the sample of size two $(x_1, x_2)$ the observer must once
again make one of three possible decisions: accept $H_0$ or
$H_1$, or the datum is insufficient for accepting either one.
If $H_0$ or $H_1$ is accepted, the experiment is terminated.
If the data are insufficient for a terminating decision, $x_3$
is observed. The same decision procedure is then repeated
on the sample point $(x_1, x_2, x_3)$. Experimentation is
continued in this manner until either $H_0$ or $H_1$ is accepted.
The number of samples required for the termination of a
sequential test is a random variable. From the class of all
sequential experiments of strength $\alpha$, $\beta$ and threshold
parameters $\theta_0$ and $\theta_1$ the decision rule which minimizes
the average number of samples required for termination
at these parameters is chosen. The optimum decision
procedure for independent sampling requires the acceptance
of the hypothesis $H_0$ at that value of $n$ for which,

$$z(n) = \sum_{i=1}^{n} \log \frac{P(x_i; \theta_1)}{P(x_i; \theta_0)} \le \log \frac{\beta}{1 - \alpha} \qquad (1)$$

and $H_1$ is accepted at that value of $n$ for which,

$$z(n) = \sum_{i=1}^{n} \log \frac{P(x_i; \theta_1)}{P(x_i; \theta_0)} \ge \log \frac{1 - \beta}{\alpha} \qquad (2)$$

where,

$P(x, \theta)$ = probability density function of the random
variable $x$ when $\theta$ is the true parameter,

$\alpha$ = probability of accepting the hypothesis $H_1$
that $\theta \ge \theta_1$ when, in fact, $H_0$ or $\theta \le \theta_0$ is true,

$\beta$ = probability of accepting the hypothesis $H_0$
that $\theta \le \theta_0$ when, in fact, $H_1$ or $\theta \ge \theta_1$ is true,

$1 - \alpha$ = probability of accepting $H_0$ when $H_0$ is, in
fact, true,

$1 - \beta$ = probability of accepting $H_1$ when $H_1$ is, in
fact, true.

Also, $\theta_1$ is always understood to be greater than $\theta_0$.
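The stopping rule of (1) and (2) can be sketched for the Bernoulli case, where each sample is a zero or a one and the parameter is written $p$ instead of $\theta$. This is a minimal illustration, not the author's implementation: the function name, the `"undecided"` return value for exhausted data, and the numerical settings in the demonstration are assumptions made here.

```python
import math
import random

def sprt_bernoulli(samples, p0, p1, alpha, beta):
    """Wald sequential test on 0/1 samples; returns (decision, samples_used)."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this, eq. (1)
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this, eq. (2)
    z_sum = 0.0
    n = 0
    for n, r in enumerate(samples, start=1):
        # Log-likelihood ratio contributed by one Bernoulli sample.
        if r:
            z_sum += math.log(p1 / p0)
        else:
            z_sum += math.log((1.0 - p1) / (1.0 - p0))
        if z_sum <= lower:
            return "H0", n
        if z_sum >= upper:
            return "H1", n
    return "undecided", n   # data ran out before either boundary was reached

if __name__ == "__main__":
    rng = random.Random(0)
    data = (1 if rng.random() < 0.6 else 0 for _ in range(10_000))
    print(sprt_bernoulli(data, p0=0.4, p1=0.6, alpha=0.05, beta=0.05))
```

The number of samples consumed varies from run to run, which is the sense in which the sample size of a sequential test is a random variable.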

In a filtering process there exist certain mathematical 
functions which are used as criteria for judging the per- 
formance of the filter. There are two such primary 
characteristics in sequential detection theory, although 
others can be used to supplement these. The two most 
important characteristics for judging sequential detection 
performance are the operating characteristic function 
called the OC function, and the average sample number 
function, called the ASN function. 

The OC function $L(\theta)$ gives the conditional probability
of accepting $H_0$ when $\theta$ is the true parameter of the distribution.
It is a strictly decreasing function of $\theta$ which
takes on the values $1 - \alpha$ at the threshold parameter
$\theta = \theta_0$ and $\beta$ at the threshold parameter $\theta = \theta_1$. This
function gives the confidence with which $H_0$ is accepted
as a function of the parameter $\theta$. For independent sampling,
the mathematical expression is given to a very good
approximation by the set of parametric equations,

$$E_\theta[e^{zh}] = 1, \qquad (3)$$

and

$$L(h) = \frac{\left(\frac{1 - \beta}{\alpha}\right)^{h} - 1}{\left(\frac{1 - \beta}{\alpha}\right)^{h} - \left(\frac{\beta}{1 - \alpha}\right)^{h}}, \qquad -\infty < h < \infty \qquad (4)$$

$$z = \log \frac{P(x; \theta_1)}{P(x; \theta_0)}. \qquad (5)$$

At the point $h = 0$ which corresponds to the point $\theta = \theta'$
at which $E_{\theta'}(z) = 0$, $L(h)$ is indeterminate. By an application
of L'Hospital's Rule or some equivalent it is
found that,

$$L(\theta') = \frac{\log \frac{1 - \beta}{\alpha}}{\log \frac{1 - \beta}{\alpha} + \log \frac{1 - \alpha}{\beta}}, \qquad \theta_0 < \theta' < \theta_1. \qquad (6)$$

Eq. (3) gives the expected value of $e^{zh}$ on condition $\theta$.
The function $L(h)$ is a universal curve for all sequential
tests of strength $\alpha$, $\beta$. Eq. (3) relates this curve to the
statistics of the random variable being measured. For a
given value of $h$, a corresponding value of $\theta$ is obtained
from (3). For the same value of $h$, a corresponding value
$L(h)$ is obtained from (4). This gives the point $[L(\theta), \theta]$
on the OC function. The procedure is repeated for all $h$.
At the threshold parameter $\theta_0$, $h = 1$ and at $\theta_1$, $h = -1$.
The OC function given by (3) and (4) is not exact, since in 
the development it was assumed that experimentation 
always terminates when the equality signs in (1) and (2) 
are satisfied exactly. This, however, is not true for dis- 
crete sampling. The excess over the boundaries is neglected 
which introduces errors that are of no practical importance. 
The author has verified this experimentally [4]. 
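The parametric construction of the OC curve from (3) and (4) can be sketched for Bernoulli samples, where (3) reduces to $p\,(p_1/p_0)^h + (1 - p)\,[(1 - p_1)/(1 - p_0)]^h = 1$ and can be solved for $p$ in closed form. The function name and the numerical values are illustrative assumptions; the point $h = 0$ is excluded because both expressions are indeterminate there.

```python
import math

def oc_point(h, p0, p1, alpha, beta):
    """Return (p, L) on the OC curve for one nonzero auxiliary parameter h."""
    if h == 0:
        raise ValueError("h = 0 is the indeterminate point; use eq. (6) there")
    a = (1.0 - beta) / alpha
    b = beta / (1.0 - alpha)
    up = (p1 / p0) ** h                     # e^{zh} when the sample is a one
    down = ((1.0 - p1) / (1.0 - p0)) ** h   # e^{zh} when the sample is a zero
    p = (1.0 - down) / (up - down)          # solves p*up + (1-p)*down = 1, eq. (3)
    L = (a ** h - 1.0) / (a ** h - b ** h)  # eq. (4)
    return p, L

if __name__ == "__main__":
    for h in (2.0, 1.0, 0.5, -0.5, -1.0, -2.0):
        p, L = oc_point(h, p0=0.4, p1=0.6, alpha=0.05, beta=0.05)
        print(f"h={h:+.1f}  p={p:.4f}  L={L:.4f}")
```

Sweeping $h$ from positive to negative values traces the curve from small $p$ (where $H_0$ is almost always accepted) to large $p$, passing through $(p_0, 1 - \alpha)$ at $h = 1$ and $(p_1, \beta)$ at $h = -1$.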

The second important characteristic, the ASN function,
gives the average number of samples required for a
sequential test to terminate when $\theta$ is the true parameter.
The mathematical expression for the ASN function is given
by

$$E_\theta(n) = \frac{L(\theta) \log \frac{\beta}{1 - \alpha} + [1 - L(\theta)] \log \frac{1 - \beta}{\alpha}}{E_\theta(z)} \qquad (7)$$

where,

$E_\theta(n)$ = average number of samples required for a
sequential test to terminate when $\theta$ is the true
parameter,

$E_\theta(z)$ = average value of the random variable which
is measured on condition that $\theta$ is the true
parameter. [$z$ is defined in (5).]


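At the indeterminate point $\theta = \theta'$, the ASN function
takes on (for all practical purposes) its maximum value
given by

$$E_{\theta'}(n) = \frac{\log \frac{1 - \beta}{\alpha}\, \log \frac{1 - \alpha}{\beta}}{E_{\theta'}(z^2)}, \qquad \theta_0 < \theta' < \theta_1. \qquad (8)$$

Since $L(\theta)$ is an approximation, it follows that $E_\theta(n)$ is
only approximate. Once again the error introduced by
neglecting the excess over the boundaries upon termination
of the experiment is generally of no practical
importance. At the threshold parameter $\theta_0$, $L(\theta_0) = 1 - \alpha$; hence

$$E_{\theta_0}(n) = \frac{(1 - \alpha) \log \frac{\beta}{1 - \alpha} + \alpha \log \frac{1 - \beta}{\alpha}}{E_{\theta_0}(z)}. \qquad (9)$$

Similarly, at $\theta = \theta_1$, $L(\theta_1) = \beta$ and,

$$E_{\theta_1}(n) = \frac{\beta \log \frac{\beta}{1 - \alpha} + (1 - \beta) \log \frac{1 - \beta}{\alpha}}{E_{\theta_1}(z)}. \qquad (10)$$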
We define the function $E_\theta(z)$ in (7) as the "average
information per sample gained from measurement" when
the data come from a distribution whose parameter is $\theta$.
The properties of (11) as a measure of information have
been studied in detail [6, 8, 9]. For given $\alpha$, $\beta$, the average
number of samples at $\theta_0$ and $\theta_1$ decreases with increase in
the average information per sample gained from measurement.
Eq. (9) and (10) relate in a very simple and compact
manner the three fundamental properties which characterize
a two value decision problem: the average number of
samples, the $(\alpha, \beta)$ errors, and the average information
per sample gained from measurement.

The average amount of information from observation
required for a sequential test to terminate when $\theta_0$ is
true for specified error probabilities $\alpha$, $\beta$ is given by the
numerator of (9). The numerator of (10) gives the average
information required for termination when $\theta_1$ is the true
parameter for specified $\alpha$, $\beta$.
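The threshold values (9) and (10) can be evaluated directly once $E_\theta(z)$ is known. The sketch below does this for Bernoulli samples, for which $E_p(z) = p \log(p_1/p_0) + (1 - p) \log[(1 - p_1)/(1 - p_0)]$; the function names and the numerical settings are illustrative assumptions, and the results carry the same approximation (excess over the boundaries neglected) as the formulas themselves.

```python
import math

def mean_llr(p, p0, p1):
    """E_p(z): mean log-likelihood ratio of one 0/1 sample when P(one) = p."""
    return p * math.log(p1 / p0) + (1.0 - p) * math.log((1.0 - p1) / (1.0 - p0))

def asn(p, L, p0, p1, alpha, beta):
    """Eq. (7): approximate average sample number, given the OC value L at p."""
    log_b = math.log(beta / (1.0 - alpha))
    log_a = math.log((1.0 - beta) / alpha)
    return (L * log_b + (1.0 - L) * log_a) / mean_llr(p, p0, p1)

if __name__ == "__main__":
    p0, p1, alpha, beta = 0.4, 0.6, 0.05, 0.05
    print("ASN at p0:", asn(p0, 1.0 - alpha, p0, p1, alpha, beta))  # eq. (9)
    print("ASN at p1:", asn(p1, beta, p0, p1, alpha, beta))         # eq. (10)
```

Moving $p_0$ and $p_1$ further apart increases $|E_p(z)|$ and lowers both values, which is the trade between information per sample and number of samples described above.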

The function $E_\theta(z)$ will play a very important part in
optimizing a Bernoulli detector. The mathematical expression
for $E_\theta(z)$ is given by

$$E_\theta(z) = \int_{-\infty}^{\infty} P(x; \theta) \log \frac{P(x; \theta_1)}{P(x; \theta_0)}\, dx. \qquad (11)$$
III. BERNOULLI SEQUENTIAL DETECTION
CHARACTERISTICS


A. Introduction 


In many problems of interest, the optimum detector
characteristic required by sequential filter theory cannot
be realized either due to the fact that the statistics of the
signal are not completely known a priori or due to certain
practical constraints imposed upon the observation.
Under such circumstances, it would still be desirable to
synthesize a sequential filter whose performance is
sufficiently good when compared to the best filter given
by the theory, for example, when all the a priori statistics
are used. That is, for the same error probabilities $(\alpha, \beta)$
and threshold parameters $\theta_0$, $\theta_1$ we would like a filter such
that the average number of samples required for termination
is comparable to the optimum in the operating
region of interest. One such filter is the Bernoulli sequential
detector [2, 4, 12] which will now be discussed.


B. Formulation of Sequential Detection by Bernoulli Trials

Consider a sequence of independent sample values
x_1, ···, x_n taken from a random variable belonging to a
one parameter family of probability density functions
P(x, θ). On each sample value we define the operation
R_{x_0} such that

$$R_{x_0}[x_i] = r_i(x_0) \tag{12}$$

where

$$r_i(x_0) = \begin{cases} 1 & \text{when } x_i \ge x_0 \\ 0 & \text{when } x_i < x_0. \end{cases}$$

The operation R_{x_0} on each of the samples of the sequence
generates a new sequence of independent samples, r_1, ···, r_n.
The samples in the new sequence can only take on the
values zero or one. The probability of choosing k_n ones
and n − k_n zeroes from such a sample of size n is given
by the familiar Bernoulli distribution

$$P_n(k_n; x_0) = \frac{n!}{k_n!\,(n-k_n)!}\,[p(x_0)]^{k_n}\,[1-p(x_0)]^{n-k_n}, \qquad k_n = 0, 1, 2, \cdots, n \tag{13}$$

where p(x_0) = p(x ≥ x_0) is the probability of a “one”
at the output of slicing operator R_{x_0}, or the probability
that the random variable x, before slicing, exceeds x_0.
It should be clear that, by varying the slicing threshold
x_0, a family of Bernoulli random variables is generated
and hence a family of Bernoulli distributions whose
parameter p depends on the threshold x_0.
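
The slicing operation (12) and the distribution (13) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the unit-variance Gaussian family with mean θ used for p(x_0, θ) is an assumed example, not a choice made in the text.

```python
import math

def slice_samples(samples, x0):
    """Slicing operator R_{x0} of (12): 1 when x >= x0, else 0."""
    return [1 if x >= x0 else 0 for x in samples]

def bernoulli_pmf(k, n, p):
    """Probability of k ones in n sliced samples, as in (13)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def exceed_prob_gaussian(x0, theta):
    """Assumed family: unit-variance Gaussian with mean theta; returns P(x >= x0)."""
    return 0.5 * math.erfc((x0 - theta) / math.sqrt(2.0))

# Slice four hypothetical sample values at the threshold x0 = 1.0.
bits = slice_samples([0.2, 1.7, 0.9, 2.4], x0=1.0)

# The Bernoulli parameter depends on both the threshold and theta.
p = exceed_prob_gaussian(1.0, theta=0.5)

# The probabilities of (13) sum to one over k = 0, ..., n.
total = sum(bernoulli_pmf(k, 4, p) for k in range(5))
```

Raising or lowering `x0` changes `p`, which is the sense in which one threshold generates one member of the family of Bernoulli distributions.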

Let us now assume that the input to the operator R_{x_0}
is either noise, whose probability density function is
P(x; θ_0), or signal-plus-noise, whose density is P(x; θ_1).
The cumulative distribution of x on condition θ (taken here
as the probability of exceeding x_0) is given by

$$p(x_0, \theta) = \int_{x_0}^{\infty} P(x, \theta)\,dx. \tag{14}$$

For a given value of θ, p(x_0, θ) is a function of the slicing
threshold x_0. For a given value of x_0, p(x_0, θ) depends on
θ. Without introducing confusion, for convenience, we
replace p(x_0, θ) by p(x, θ), where x is an arbitrary slicing
threshold. We assume that p(x, θ) belongs to the class of
one parameter families of distributions such that, for any
given slicing threshold x, p(x, θ) is strictly monotonic in
θ. This includes the well known Gaussian and Rayleigh
families as well as others that are of physical interest.
Corresponding to a test of hypothesis θ ≤ θ_0 against
θ ≥ θ_1, we obtain an equivalent Bernoulli test

$$p(x, \theta) \le p(x, \theta_0) = p_0,$$
$$p(x, \theta) \ge p(x, \theta_1) = p_1.$$


Fig. 1—Slicing operator R_{x_0}: R_{x_0}[x] = 1 for input x ≥ x_0, and 0 for x < x_0.

Fig. 1 illustrates the slicing characteristic of the operator
R_{x_0}. The operator converts any input function into a two
valued output function.

Fig. 2 illustrates the cumulative distribution function
p(x, θ) of the random variable x for two parameters
θ_0, θ_1 of the probability density P(x, θ). The figure
illustrates the properties of the class of distributions
considered. For any slicing threshold x_0, p(x_0, θ) = p(x ≥ x_0; θ)
increases with increase in the parameter θ. As an example,
if θ_0 and θ_1 represent the signal-to-noise ratio of a signal
embedded in noise, then Fig. 2 shows that the greater the
signal-to-noise ratio the greater the probability that the
random variable x will exceed the threshold x_0. This is
true for any x_0.

Fig. 2—Probability distribution of random variable x for
parameters θ_0, θ_1 (θ_1 > θ_0).

Fig. 3—Probability distributions of random variable x for
parameters θ_0, θ_1 after slicing by operator R_{x_0}.

Fig. 3 is the distribution function of the random variable
x after being sliced by the operator R_{x_0} of Fig. 1. Since the
operator converts x into a random variable which takes
on the values zero and one, the two valued variable has a
discontinuous distribution function as shown in Fig. 3.
It takes on the value zero with probability 1 − p(x, θ)
and the value unity with probability p(x, θ). This is
equivalent to saying that before slicing p(x_0, θ) is the
probability that x is greater than or equal to x_0, while
1 − p(x_0, θ) is the probability that x < x_0. The probability
of obtaining k_n ones and n − k_n zeroes in n independent
samples is given by the convolution of n distributions of
the form shown in Fig. 2. This, of course, leads to the
well-known Bernoulli distribution.


C. Bernoulli Detection Characteristics


Statistical Decision Regions: From (13) the logarithm
of the probability ratio for the Bernoulli detector is
given by

$$z(n, x) = k_n\left[\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}\right] + n\log\frac{1-p_1}{1-p_0} \tag{15}$$

where for given values of θ_0 and θ_1, p_0 and p_1 are functions
of x. As previously shown, the Bernoulli random variable
of probability ratio z(n, x) is obtained by slicing the
random variable of density P(x, θ) at the threshold x.
The optimum decision regions are obtained from (1),
(2), and (15) as

$$z(n, x) \le \log\frac{\beta}{1-\alpha}; \quad \text{accept } H_0 \tag{16a}$$
$$z(n, x) \ge \log\frac{1-\beta}{\alpha}; \quad \text{accept } H_1 \tag{16b}$$
$$\alpha, \beta < \tfrac{1}{2}.$$

For any value of n for which one of the inequalities is
satisfied first, the corresponding hypothesis is accepted.
Inequalities (16) in conjunction with (15) can be expressed as

$$k_n \le \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + n\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} \;\rightarrow\; H_0 \tag{17}$$

$$k_n \ge \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + n\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} \;\rightarrow\; H_1 \tag{18}$$

The arrow indicates the hypothesis to which the acceptance
region belongs.
Let us now define the following three transformations:

$$a = \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{1-p_0}{1-p_1}}, \tag{19}$$

$$b = \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{1-p_0}{1-p_1}}, \tag{20}$$

$$c = \frac{\log\frac{p_1}{p_0}}{\log\frac{1-p_0}{1-p_1}}. \tag{21}$$

Substituting these into (17) and (18) yields the following
decision regions

$$k_n \le \frac{b}{c+1} + \frac{n}{c+1} \;\rightarrow\; H_0 \tag{22}$$

$$k_n \ge \frac{a}{c+1} + \frac{n}{c+1} \;\rightarrow\; H_1 \tag{23}$$

It is seen that the a, b, c transformations completely
define the sequential Bernoulli detector. Hence, when the
observer specifies the parameter point (p_0, p_1, α, β), he
completely specifies the a, b, c transformations. However,
an infinity of such parameter points exists which specify
the same a, b, c transformation and hence the same
sequential test. Furthermore, since the sequential test is
optimum [1] at the parameter point (p_0, p_1, α, β) and
since there exists an infinity of such points which define
the same test, it follows that a given sequential Bernoulli
experiment as defined by the set a, b, c is optimum at an
infinity of parameter points which satisfy the same a,
b, c. Fig. 4 is a curve of (21) for c = 4 and c = 8. It is
seen that an infinity of points (p_0, p_1) exists which lead
to the same c. For a given (a, b) the corresponding values
of α, β can be obtained from (19) and (20).
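
The decision regions can be exercised with a short sketch. It assumes the transformations take the form a = log[(1 − β)/α]/g, b = log[β/(1 − α)]/g, c = log(p_1/p_0)/g with g = log[(1 − p_0)/(1 − p_1)], so that the boundaries are two parallel lines of slope 1/(c + 1) in the (n, k_n) plane. The function names and the numerical parameter point are hypothetical.

```python
import math

def abc_transform(p0, p1, alpha, beta):
    """a, b, c transformations, in the form assumed in the lead-in above."""
    g = math.log((1.0 - p0) / (1.0 - p1))
    a = math.log((1.0 - beta) / alpha) / g
    b = math.log(beta / (1.0 - alpha)) / g
    c = math.log(p1 / p0) / g
    return a, b, c

def bernoulli_sequential_test(bits, a, b, c):
    """Run the test on a 0/1 sequence.

    Returns (decision, n): decision is "H0" or "H1" at the first boundary
    crossing, or None if the sequence ends before either boundary is reached.
    """
    k = 0
    n = 0
    for n, r in enumerate(bits, start=1):
        k += r
        if k <= (b + n) / (c + 1.0):   # lower line: accept H0
            return "H0", n
        if k >= (a + n) / (c + 1.0):   # upper line: accept H1
            return "H1", n
    return None, n

# Hypothetical parameter point (p0, p1, alpha, beta).
a, b, c = abc_transform(0.2, 0.5, 0.01, 0.01)
all_ones = bernoulli_sequential_test([1] * 20, a, b, c)
all_zeros = bernoulli_sequential_test([0] * 20, a, b, c)
```

Only a, b, c enter the test loop, which illustrates why every parameter point sharing the same a, b, c leads to the same sequential test.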


D. Performance Characteristics of a Bernoulli Detector 


The Operating Characteristic Function (OC Function):
The OC function discussed in Part II gives the probability
of accepting H_0 when p is the true parameter. Eq. (4) is
the same for all sequential tests of strength α, β. Hence,

$$L_x(h) = \frac{\left(\frac{1-\beta}{\alpha}\right)^h - 1}{\left(\frac{1-\beta}{\alpha}\right)^h - \left(\frac{\beta}{1-\alpha}\right)^h} \tag{24}$$

where L_x(h) indicates that the Bernoulli case is considered.
It is now necessary to apply (3) to the Bernoulli
case. The random variable z takes on two values, log (p_1/p_0)
with probability p and log [(1 − p_1)/(1 − p_0)] with
probability 1 − p. Hence, from (3),

$$E_p[e^{hz}] = p\exp\left\{h\log\frac{p_1}{p_0}\right\} + (1-p)\exp\left\{h\log\frac{1-p_1}{1-p_0}\right\} = 1$$

or,

$$E_p[e^{hz}] = p\left(\frac{p_1}{p_0}\right)^h + (1-p)\left(\frac{1-p_1}{1-p_0}\right)^h = 1.$$

Solving for p, we obtain

$$p(h) = \frac{1 - \left(\frac{1-p_1}{1-p_0}\right)^h}{\left(\frac{p_1}{p_0}\right)^h - \left(\frac{1-p_1}{1-p_0}\right)^h} \tag{25}$$

where −∞ < h < ∞. From (24) and (25), the OC function
L(p) can be obtained. If it is desired to plot L_x(θ), it is
only necessary to obtain p(θ) corresponding to some
slicing threshold x, where

$$p(x, \theta) = \int_x^{\infty} P(y, \theta)\,dy. \tag{26}$$

For given values of α, β, points on L_x(h) are computed.
For the same values of h, points on p(h) are obtained.
This results in the function L(p). From (26), p is obtained
as a function of θ. The OC function L_x(θ) is then obtained
from L(p) and p(θ). Corresponding to the parameters
θ_0 and θ_1 and some slicing threshold x, we have the
Bernoulli parameters p_0 = p(x, θ_0), p_1 = p(x, θ_1). At
the threshold parameters θ_0, θ_1, h(θ_0) = 1, h(θ_1) = −1.
Hence,

$$L_x(\theta_1) = L(\theta_1) = \beta,$$

and

$$L_x(\theta_0) = L(\theta_0) = 1 - \alpha.$$

It is seen that the OC function L_x(θ) of the Bernoulli
test corresponding to some slicing threshold x = x_0 has
the same value at θ_0, θ_1 as the OC function L(θ), where
L(θ) corresponds to the random variable x, before slicing,
whose probability density is P(x, θ).

Fig. 4—c = log (p_1/p_0) / log [(1 − p_0)/(1 − p_1)] for Bernoulli sequential filter.

Let us now obtain the OC function in terms of the a, b,
c transformations previously defined. If we solve (19)
and (20) for log [(1 − β)/α] and log [β/(1 − α)] and introduce
the transformation

$$u = \exp\left\{h\log\frac{1-p_0}{1-p_1}\right\} \tag{27}$$

we obtain

$$L_x(u) = \frac{u^a - 1}{u^a - u^b}. \tag{28}$$

From (27), it is clear that as h ranges over the entire
real line, u ranges only over the positive half of the real
line. At the point u = 1 (corresponding to h = 0), an
application of L'Hospital's Rule yields

$$L_x(1) = \frac{a}{a - b}. \tag{29}$$

[Note from (20) that b < 0, since α < 0.5, β < 0.5, p_0 < p_1.]

It is now necessary to express (25) in terms of the u
variable. This can be easily accomplished by solving (21)
for p_1/p_0, substituting this result into (25), and then
introducing the change of variable given by (27). This
yields

$$p(u) = \frac{u - 1}{u^{c+1} - 1}. \tag{30}$$

At the indeterminate point u = 1,

$$p(1) = \frac{1}{c + 1}. \tag{31}$$

L(p) can be obtained from (28) and (30). Furthermore,
p(x, θ) is given by (26). The function L_x(θ) can now be
obtained for some slicing threshold x.
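
As a rough numerical check of this parametric description, the sketch below traces the OC function through the auxiliary variable u. It assumes the forms L(u) = (u^a − 1)/(u^a − u^b) and p(u) = (u − 1)/(u^(c+1) − 1), with the limits a/(a − b) and 1/(c + 1) at u = 1. Function names and the parameter point are hypothetical.

```python
import math

def abc_transform(p0, p1, alpha, beta):
    """a, b, c transformations (assumed form, see lead-in)."""
    g = math.log((1.0 - p0) / (1.0 - p1))
    return (math.log((1.0 - beta) / alpha) / g,
            math.log(beta / (1.0 - alpha)) / g,
            math.log(p1 / p0) / g)

def oc_from_u(u, a, b):
    """Probability of accepting H0 as a function of u."""
    if abs(u - 1.0) < 1e-9:
        return a / (a - b)          # limiting value at the indeterminate point
    return (u**a - 1.0) / (u**a - u**b)

def p_from_u(u, c):
    """Bernoulli parameter p as a function of u."""
    if abs(u - 1.0) < 1e-9:
        return 1.0 / (c + 1.0)      # limiting value at the indeterminate point
    return (u - 1.0) / (u ** (c + 1.0) - 1.0)

p0, p1, alpha, beta = 0.2, 0.5, 0.01, 0.01   # hypothetical parameter point
a, b, c = abc_transform(p0, p1, alpha, beta)

# h = 1 and h = -1 correspond to u = (1 - p0)/(1 - p1) and its reciprocal.
u_at_p0 = (1.0 - p0) / (1.0 - p1)
u_at_p1 = 1.0 / u_at_p0

# Sweeping u over positive values yields (p, L) pairs, i.e., the curve L(p).
curve = [(p_from_u(u, c), oc_from_u(u, a, b)) for u in (0.5, 0.8, 1.0, 1.25, 2.0)]
```

At `u_at_p0` the pair evaluates to (p_0, 1 − α) and at `u_at_p1` to (p_1, β), matching the values the OC function must take at the threshold parameters.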

The Average Sample Number Function (ASN Function):
Up to this point we have shown that a random variable
x of probability density function P(x, θ) can be converted,
by means of a slicing operator R_x, into a random variable
that can take on two values, one and zero, with
probabilities p(x, θ) and 1 − p(x, θ). The function p(x, θ)
gives the probability that the random variable takes on
a value greater than or equal to x, when θ is the true
parameter. For each slicing threshold x and for some
parameter θ, a two valued random variable is realized
whose distribution depends on x.

We have also shown that, corresponding to a sequential
test defined by the parameter point (θ_0, θ_1, α, β) for the
random variable before slicing, there exists a Bernoulli
sequential test defined by {p(x, θ_0), p(x, θ_1), α, β} for the
random variable after slicing. The Bernoulli test is clearly
a function of the slicing threshold x. However, at the
threshold parameters θ_0, θ_1, the Bernoulli test can have
the same strength α, β as the test on the random variable
before slicing. It should be clear that the slicing operator R_x
destroys information which is useful in the decision
process. Thus for the same (θ_0, θ_1, α, β) we expect the
Bernoulli test to require, on the average, more samples
than the corresponding test before slicing by the operator
R_x. The ASN functions at the points (θ_0, θ_1, α, β) can be
used to compare the two tests. The ASN function of the
Bernoulli test of {p(x, θ_0), p(x, θ_1), α, β} will be a function
of the slicing threshold x. It is desirable to choose the
threshold x or the operator R_x for a given α, β in such a
way that the ASN function is a minimum at either θ_0 or
θ_1, or perhaps at the point θ = θ′, θ_0 < θ′ < θ_1, where the
ASN function is a maximum. It is this phase of the
detection problem which will now be considered.

In order to obtain the ASN function for the Bernoulli
case, it is necessary to make use of the general expression
given in (7). The numerator remains the same except
that, for the Bernoulli case, L(θ) is replaced by L_x(θ),
where L_x(θ) is obtained by making use of (24)-(26),
as outlined in the previous section on the OC function.
In order to obtain E_θ(z) for the Bernoulli case, it is
necessary to recall that z takes on only two values, log
(p_1/p_0) with probability p, and log [(1 − p_1)/(1 − p_0)] with
probability 1 − p. It is also necessary to remember that
the Bernoulli parameters p_0, p_1, and p are functions of the
slicing threshold x and the parameter θ. Hence, the
average value of z on condition p = p(x, θ) is given by


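$$E_p(x, \theta; z) = p(x, \theta)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}. \tag{32}$$

From (7) and (32), the ASN function is

$$E_p(x, \theta; n) = \frac{L_x(\theta)\log\frac{\beta}{1-\alpha} + [1 - L_x(\theta)]\log\frac{1-\beta}{\alpha}}{p(x, \theta)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}}. \tag{33}$$

Both the numerator and denominator of (33) are functions
of the slicing threshold x. At the parameters θ_0 and θ_1,
recalling that L_x(θ_0) = 1 − α and L_x(θ_1) = β, (33)
becomes

$$E_p(x, \theta_0; n) = \frac{(1-\alpha)\log\frac{\beta}{1-\alpha} + \alpha\log\frac{1-\beta}{\alpha}}{p(x, \theta_0)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta_0)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}} \tag{34}$$

and

$$E_p(x, \theta_1; n) = \frac{\beta\log\frac{\beta}{1-\alpha} + (1-\beta)\log\frac{1-\beta}{\alpha}}{p(x, \theta_1)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta_1)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}}. \tag{35}$$

At the indeterminate point,

$$p(x, \theta') = \frac{\log\frac{1 - p(x, \theta_0)}{1 - p(x, \theta_1)}}{\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + \log\frac{1 - p(x, \theta_0)}{1 - p(x, \theta_1)}}, \tag{36}$$

the ASN function can be obtained from (8). For this case
it is easy to show that

$$E_p(x, \theta'; n) = -\frac{\left(\log\frac{1-\beta}{\alpha}\right)\left(\log\frac{\beta}{1-\alpha}\right)}{\left(\log\frac{p(x, \theta_1)}{p(x, \theta_0)}\right)\left(\log\frac{1 - p(x, \theta_0)}{1 - p(x, \theta_1)}\right)}. \tag{37}$$

Eq. (37) gives the maximum value of the ASN function.
Applying the a, b, c transformations of (19)-(21) to (33)
yields the equivalent expression in terms of the parameter
u. Thus,

$$E_p(x, u; n) = \frac{b\,L_x(u) + a[1 - L_x(u)]}{(c + 1)\,p(x, u) - 1}. \tag{38}$$

At the indeterminate point u = 1, (36) gives

$$p(x, 1) = \frac{1}{c + 1}, \qquad u = 1 \tag{39}$$

and (37) becomes

$$E_p(x, 1; n) = -\frac{ab}{c}. \tag{40}$$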
For a given value of the slicing threshold x, the ASN
function is given as a function of u, where 0 < u < ∞.
From (30), the parameter p of the Bernoulli distribution
is given as a function of u. It is now possible to obtain the
ASN function in terms of the parameter p. For a given
threshold x, the Bernoulli parameter p is related to the
parameter θ of the probability density function P(x, θ)
by (26). This gives the ASN function for the Bernoulli
case as a function of θ.
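
The procedure just described can be sketched numerically. The code assumes the u-parametrized ASN takes the form [b L(u) + a(1 − L(u))]/[(c + 1) p(u) − 1], with the value −ab/c at u = 1; it then compares the result at p_0 with a direct evaluation of the ratio of average log-ratio at termination to average log-ratio per sample. All names and numbers are hypothetical.

```python
import math

def abc_transform(p0, p1, alpha, beta):
    """a, b, c transformations (assumed form, see lead-in)."""
    g = math.log((1.0 - p0) / (1.0 - p1))
    return (math.log((1.0 - beta) / alpha) / g,
            math.log(beta / (1.0 - alpha)) / g,
            math.log(p1 / p0) / g)

def oc_from_u(u, a, b):
    """OC function in terms of u."""
    return (u**a - 1.0) / (u**a - u**b)

def p_from_u(u, c):
    """Bernoulli parameter in terms of u."""
    return (u - 1.0) / (u ** (c + 1.0) - 1.0)

def asn_from_u(u, a, b, c):
    """Average sample number in terms of u."""
    if abs(u - 1.0) < 1e-9:
        return -a * b / c           # value at the indeterminate point
    L = oc_from_u(u, a, b)
    p = p_from_u(u, c)
    return (b * L + a * (1.0 - L)) / ((c + 1.0) * p - 1.0)

p0, p1, alpha, beta = 0.2, 0.5, 0.01, 0.01   # hypothetical parameter point
a, b, c = abc_transform(p0, p1, alpha, beta)
u0 = (1.0 - p0) / (1.0 - p1)                 # u at p = p0 (h = 1)

# Direct evaluation at p0 from the error probabilities and Bernoulli parameters.
numerator = ((1.0 - alpha) * math.log(beta / (1.0 - alpha))
             + alpha * math.log((1.0 - beta) / alpha))
denominator = (p0 * math.log(p1 / p0)
               + (1.0 - p0) * math.log((1.0 - p1) / (1.0 - p0)))
asn_direct_p0 = numerator / denominator

asn_p0 = asn_from_u(u0, a, b, c)
asn_p1 = asn_from_u(1.0 / u0, a, b, c)
asn_peak = asn_from_u(1.0, a, b, c)
```

For this parameter point the value at u = 1 exceeds the values at p_0 and p_1, in line with the statement that the indeterminate point gives the maximum of the ASN function.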

Let us now refer to the ASN functions (34) and (35).
It is seen that the numerator is only a function of the
conditional error probabilities α, β. The denominator is a
function of the parameters of the Bernoulli distribution.
However, we repeat again, the parameters of the Bernoulli
distribution p(x, θ_0), p(x, θ_1) are functions of the
parameters θ_0 and θ_1 of the probability density of the random
variable before slicing at the threshold x. The Bernoulli
sequential test, corresponding to θ_0 and θ_1, is also a
function of the slicing threshold x.

We have previously defined the denominator E_θ(z) of
the ASN function as the average information per sample
gained from measurement on condition that θ is the
parameter of the probability density. This function is
the average value of the measured quantity which is
used for making a decision. It is seen that the larger the
information gain per sample, the smaller the average
sample number for a given set of parameters. Intuitively
one would expect these quantities to be related in this
manner. The function has important properties in its
own right which make it useful as an information measure,
exclusive of sequential analysis [6], [8], and [9]. For α,
β ≪ 0.5, the practical range of interest, it is seen from
(34) and (35) that α, β fall off exponentially with increase
in the average number of samples and with increase in
the information gain. More precisely

$$\beta \approx \exp\left[E_p(x, \theta_0; n)\,E_p(x, \theta_0; z)\right]; \qquad E_p(x, \theta_0; z) < 0 \tag{41}$$

and

$$\alpha \approx \exp\left[-E_p(x, \theta_1; n)\,E_p(x, \theta_1; z)\right]; \qquad E_p(x, \theta_1; z) > 0. \tag{42}$$

(See [12] for the proof that E_{θ_0}(z) < 0 and E_{θ_1}(z) > 0.)
Eq. (34) and (35), in a very compact and simple form,
relate the conditional error probabilities, the average
number of samples required for termination at p(x, θ_0)
and p(x, θ_1), and the average information per sample
gained from measurement. It is also seen that the
information gain function, (32), is a function of the slicing
threshold x.

In order to compare the Bernoulli detector for a given
slicing threshold x to the sequential test before slicing
[corresponding to the density P(x, θ) at the same
parameters (θ_0, θ_1, α, β)], we take the ratio of the two ASN
functions. That is, the efficiency δ_x(x, θ_0) of the Bernoulli
test, on condition that θ_0 is the true parameter, is given by
the ratio of (9) and (34), or

$$\delta_x(x, \theta_0) = \frac{p(x, \theta_0)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta_0)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}}{E_{\theta_0}(z)} \tag{43}$$

and, on condition that θ_1 is true, we have

$$\delta_x(x, \theta_1) = \frac{p(x, \theta_1)\log\frac{p(x, \theta_1)}{p(x, \theta_0)} + [1 - p(x, \theta_1)]\log\frac{1 - p(x, \theta_1)}{1 - p(x, \theta_0)}}{E_{\theta_1}(z)}. \tag{44}$$

The efficiency of the Bernoulli detector is clearly a
function of the slicing threshold x. For each Bernoulli test
which corresponds to a different slicing threshold x, we
obtain a different efficiency. It therefore seems reasonable
to find the slicing threshold x which maximizes the
efficiency δ_x(x, θ_0) or δ_x(x, θ_1). Since the denominators in
(43) and (44) are independent of x, this is equivalent to
maximizing the magnitude of the numerator, or what has
been defined in (32) as the information gain function.
From (34) it is seen that maximizing the information gain
with respect to x for given parameters (θ_0, θ_1, α, β) is
equivalent to minimizing the ASN function on condition
that θ_0 is true. Similarly, maximizing the information gain
with respect to x for given parameters (θ_0, θ_1, α, β)
minimizes the ASN function on condition that θ_1 is the
true parameter. The Bernoulli detector chosen in this
manner is optimum in the sense described.
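
A simple grid search illustrates the threshold selection described above. The sketch assumes a unit-variance Gaussian family whose mean is the parameter θ (an illustrative choice, for which the unsliced information per sample has magnitude (θ_1 − θ_0)²/2), and it maximizes the magnitude of the information gain on condition θ_0. The function names, the grid, and the values θ_0 = 0, θ_1 = 0.5 are hypothetical.

```python
import math

def exceed_prob(x, theta):
    """P(sample >= x) for the assumed unit-variance Gaussian with mean theta."""
    return 0.5 * math.erfc((x - theta) / math.sqrt(2.0))

def info_gain(p, p0, p1):
    """Average log-ratio per sliced sample when the true Bernoulli parameter is p."""
    return p * math.log(p1 / p0) + (1.0 - p) * math.log((1.0 - p1) / (1.0 - p0))

def best_threshold(theta0, theta1, lo=-2.0, hi=3.0, steps=501):
    """Grid search for the threshold maximizing the gain magnitude at theta0."""
    best_x, best_gain = None, -1.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        p0, p1 = exceed_prob(x, theta0), exceed_prob(x, theta1)
        gain = -info_gain(p0, p0, p1)   # the gain is negative on condition theta0
        if gain > best_gain:
            best_x, best_gain = x, gain
    return best_x, best_gain

theta0, theta1 = 0.0, 0.5
x_opt, gain_opt = best_threshold(theta0, theta1)

# Magnitude of the unsliced information per sample for the assumed Gaussian family.
unsliced = 0.5 * (theta1 - theta0) ** 2
efficiency = gain_opt / unsliced
```

Because slicing discards information, `efficiency` stays below one; the same search run on the gain at θ_1 need not return the same threshold when θ_1 − θ_0 is not small.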

If desired, the efficiency can also be defined at the point
θ = θ′, where the ASN function is a maximum. From
(8) and (37) we have

$$\delta_x(x, \theta') = \frac{\log\frac{p(x, \theta_1)}{p(x, \theta_0)}\,\log\frac{1 - p(x, \theta_0)}{1 - p(x, \theta_1)}}{E_{\theta'}(z^2)}. \tag{45}$$

More precisely, corresponding to the parameters (θ_0, θ_1,
α, β), we choose the optimum slicing threshold at that
value of x for which

$$\frac{d\,\delta_x(x, \theta)}{dx} = 0 \tag{46}$$

at one of the points θ = θ_0, θ = θ_1, or θ = θ′. When
(θ_1 − θ_0) ≪ 1, the threshold x is optimum in the interval
θ_0 ≤ θ ≤ θ_1. The optimization (46) for given (θ_0, θ_1,
α, β) is equivalent to finding the threshold x which
minimizes the ASN function, or

$$\frac{d\,E_p(x, \theta; n)}{dx} = 0 \tag{47}$$

at one of the points θ = θ_0, θ = θ_1, or θ = θ′.


E. Threshold Signal Approximation to Information Gain
Function


The information gain function for a Bernoulli detector
at the parameters p(x, θ_0) and p(x, θ_1) is given by the
denominator of (34) and (35), respectively. For the
threshold signal case, (θ_1 − θ_0) ≪ 1, it is desirable to
obtain an approximation to the information gain function.
In the problems considered here, we assumed that the one
parameter family distributions are such that, when
(θ_1 − θ_0) ≪ 1, |p(x, θ_1) − p(x, θ_0)| ≪ 1 for all x. For
convenience, we will rewrite (32)

$$E_p(z) = p\log\frac{p_1}{p_0} + [1 - p]\log\frac{1 - p_1}{1 - p_0}, \tag{48}$$

remembering that p, p_0, and p_1 are functions of θ, θ_0, and
θ_1, respectively, and of x.
In (48) let us expand log (p_1/p_0) in a Taylor series,

$$\log\frac{p_1}{p_0} = \frac{\Delta p_0}{p_0} - \frac{1}{2}\left(\frac{\Delta p_0}{p_0}\right)^2 + \cdots \tag{49}$$

where

$$\Delta p_0 = p_1 - p_0.$$

By a similar expansion, log [(1 − p_1)/(1 − p_0)] is given by

$$\log\frac{1 - p_1}{1 - p_0} = -\frac{\Delta p_0}{1 - p_0} - \frac{1}{2}\left(\frac{\Delta p_0}{1 - p_0}\right)^2 - \cdots \tag{50}$$

If we now substitute these expansions into (48) and
neglect third order terms in Δp_0, we obtain

$$E_p(z) = \frac{p_1 - p_0}{p_0(1 - p_0)}\left[p - \frac{p_1 + p_0}{2}\right]. \tag{51}$$

When p = p_0 and p = p_1, remembering that these depend
on θ_0, θ_1, and x, we obtain from (51)

$$-E_{p_0}(x, \theta_0; z) \approx E_{p_1}(x, \theta_1; z) \approx \frac{1}{2}\,\frac{[p(x, \theta_1) - p(x, \theta_0)]^2}{p(x, \theta_0)[1 - p(x, \theta_0)]}. \tag{52}$$

Hence, the information gain per sample at θ_0 and θ_1 is
the same for threshold signals, for example, when
θ_1 − θ_0 ≪ 1.
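
The quality of this approximation is easy to probe numerically. The sketch below compares the exact two-valued information gain with the threshold-signal form ½(p_1 − p_0)²/[p_0(1 − p_0)] for a pair of nearby Bernoulli parameters; the function names and the values p_0 = 0.30, p_1 = 0.31 are hypothetical.

```python
import math

def info_gain(p, p0, p1):
    """Exact average log-ratio per sliced sample when the true parameter is p."""
    return p * math.log(p1 / p0) + (1.0 - p) * math.log((1.0 - p1) / (1.0 - p0))

def info_gain_threshold_approx(p0, p1):
    """Threshold-signal approximation to the magnitude of the gain at p0 or p1."""
    return 0.5 * (p1 - p0) ** 2 / (p0 * (1.0 - p0))

p0, p1 = 0.30, 0.31                       # hypothetical nearby parameters
exact_at_p0 = info_gain(p0, p0, p1)       # negative
exact_at_p1 = info_gain(p1, p0, p1)       # positive
approx = info_gain_threshold_approx(p0, p1)

# Relative errors of the approximation at the two threshold parameters.
rel_err_p0 = abs(-exact_at_p0 - approx) / approx
rel_err_p1 = abs(exact_at_p1 - approx) / approx
```

Widening the gap between `p0` and `p1` makes the two exact magnitudes drift apart, which is where the neglected third order terms start to matter.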

The problem of detecting a sine-wave carrier of small
signal-to-noise ratio in Gaussian noise by Bernoulli trials
was treated by Harrington [7], who considers this problem
for the case of fixed samples. For a given minimum
detectable signal, he chooses the optimum slicing threshold
x as the point at which the function

$$\rho(x) = \frac{p(x, \theta_1) - p(x, \theta_0)}{\sqrt{p(x, \theta_0)[1 - p(x, \theta_0)]}} \tag{53}$$

is a maximum. For threshold signals, it is seen that the
information gain criterion used here is related to that of
(53) by

$$E_{p_1}(x, \theta_1; z) = -E_{p_0}(x, \theta_0; z) = \tfrac{1}{2}\rho^2(x). \tag{54}$$

The maximum of ρ(r) and ρ²(r) occurs at the same point,
since both functions are positive for all values of r. It is,
therefore, seen that, for threshold signals, a choice of the
optimum slicing threshold based on a maximization of
ρ(x) is equivalent to a maximization of the information
gain function E_{p_0}(z). Furthermore, the threshold is also
optimum at the point p_1, and it can be shown that it is
optimum for all p_0 < p < p_1. For large signal-to-noise
ratios, the two criteria do not yield the same results.
For certain distributions, it is possible for ρ(r) to be
infinite for certain sets of parameters, while the criterion
presented here leads to finite results and has a maximum.


Fig. 5—Information gain function of Bernoulli sequential filter for
a Gaussian signal in Gaussian noise of signal-to-noise ratio 0.50.

Fig. 5 is a curve of E_{p_0}(r, z) and ρ(r) for a Gaussian
signal in Gaussian noise for a signal-to-noise ratio η = 0.5.
E_{p_0}(r, z) is a maximum at r = 1.55, while ρ(r) is maximum
at r = 1.75. Fig. 6 is a curve of the same function for
η = 1. The difference between the two maxima is quite
large for this case. For the Gaussian case, it can be shown
graphically that, for η somewhat greater than unity,
ρ(r) does not converge. The trend is clear. For the Rayleigh
case, it can easily be shown that ρ(r) diverges for all η >
√2. It is felt that the comparison is a good illustration
of the importance of the information gain criterion.

Fig. 6—Information gain function of Bernoulli sequential filter for
a Gaussian signal in Gaussian noise of signal-to-noise ratio 1.0.


IV. APPLICATION TO THE DETECTION OF A SINE-WAVE
CARRIER IN NOISE OF SIGNAL-TO-NOISE RATIO
LESS THAN UNITY


The important problem of designing a Bernoulli
sequential filter for the detection of a sine-wave carrier
in noise is now considered. The Bernoulli detector tests
the hypothesis p = p_0 that noise is present against the
alternative hypothesis p ≥ p_1 that signal-plus-noise is
present. The distribution function of the envelope of a
sine-wave carrier in noise is given in the literature [2]-[5]
and [7]:

$$p(r, \eta) = \int_r^{\infty} y\,e^{-\frac{1}{2}(y^2 + \eta^2)}\,I_0(\eta y)\,dy, \tag{55}$$

η = peak-signal to rms noise-ratio,
I_0(ηy) = zero order modified Bessel function.

Corresponding to testing the hypothesis η = 0 against
η ≥ η_1, we define a Bernoulli test, p(r, 0) against p(r, η_1).
For the case when only noise is present, η = 0 and

$$p(r, 0) = e^{-\frac{1}{2}r^2}. \tag{56}$$

Also for small signal-to-noise ratios η = η_1 ≪ 1, (55) can
be approximated by expanding the integrand in a power
series in η_1 and neglecting all orders of η_1 higher than the
second. This yields

$$p(r, \eta_1) \approx \int_r^{\infty} y\,e^{-\frac{1}{2}y^2}\left(1 - \tfrac{1}{2}\eta_1^2 + \tfrac{1}{4}\eta_1^2 y^2\right)dy = e^{-\frac{1}{2}r^2}\left(1 + \frac{\eta_1^2 r^2}{4}\right) = p(r, 0)\left[1 + \frac{\eta_1^2 r^2}{4}\right]. \tag{57}$$

From (43) and (52) and using the results of Bussgang
and Middleton [5] for E_η(z), η ≪ 1, the efficiency of the
Bernoulli detector, as compared to the best detector
given by the theory for threshold signals, is found to be

$$\delta_r(r, \eta_1) = \delta_r(r, 0) = \frac{4[p(r, \eta_1) - p(r, 0)]^2}{\eta_1^4\,p(r, 0)[1 - p(r, 0)]}, \tag{58}$$

where δ_r(r, η_1) = δ_r(r, 0) = efficiency of Bernoulli detector at
the output of a slicing operator at the threshold x = r,
when η = η_1 and η = 0 are the threshold parameters.
Substituting (57) into (58) yields for the efficiency

$$\delta_r(r, \eta_1) = \delta_r(r, 0) = \frac{r^4\,p(r, 0)}{4[1 - p(r, 0)]}. \tag{59}$$

We can eliminate r in (59) by making use of (56). Hence,

$$\delta_r(r, \eta_1) = \delta_r(r, 0) = \frac{p(r, 0)[\log p(r, 0)]^2}{1 - p(r, 0)}. \tag{60}$$

Since p(r, 0) is a monotonic function of r, maximizing
(60) with respect to p is the same as maximizing with
respect to r. The value of p(r, 0) which maximizes (60)
is given by

$$p(r, 0) = e^{-2[1 - p(r, 0)]} \tag{61}$$

or

$$p(r, 0) = 0.205.$$

The corresponding threshold is obtained from (56) as

$$r = \frac{R}{N} = 1.78, \tag{62}$$

R = voltage sample of carrier envelope.
N = rms value of noise.

The efficiency at the optimum threshold is given by

$$\delta_r(\eta_1) = \delta_r(0) = 64 \text{ per cent}. \tag{63}$$

Fig. 7—Information gain function of Bernoulli sequential filter
for a sine-wave carrier in noise of signal-to-noise ratio 0.50.

Fig. 8—Information gain function of Bernoulli sequential filter
for a sine-wave carrier in noise of signal-to-noise ratio 1.0.

Fig. 9—Average sample number function of Bernoulli sequential
filter for a sine-wave carrier in noise of signal-to-noise ratio 0.50.

Fig. 10—Average sample number function of Bernoulli sequential
filter for a sine-wave carrier in noise of signal-to-noise ratio 1.0.

Fig. 11—Operating characteristic function of Bernoulli sequential
filter for a sine-wave carrier in noise.


This checks the results given elsewhere [5] for the same
problem.
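
The optimum operating point can be reproduced approximately with a few lines of numerical work. The sketch assumes the threshold-signal efficiency has the form p (log p)²/(1 − p) with p = p(r, 0) = exp(−r²/2), so that setting its derivative to zero gives log p + 2(1 − p) = 0; the root is found by bisection. The function names and the bracketing interval are hypothetical.

```python
import math

def efficiency(p):
    """Threshold-signal efficiency as a function of p = p(r, 0) (assumed form)."""
    return p * math.log(p) ** 2 / (1.0 - p)

def optimum_p(lo=0.05, hi=0.5, iters=60):
    """Bisection on the stationarity condition log p + 2(1 - p) = 0."""
    def f(p):
        return math.log(p) + 2.0 * (1.0 - p)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid            # sign change lies in [lo, mid]
        else:
            lo = mid            # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

p_opt = optimum_p()
r_opt = math.sqrt(-2.0 * math.log(p_opt))   # invert p(r, 0) = exp(-r^2 / 2)
eff_opt = efficiency(p_opt)
```

The resulting `p_opt`, `r_opt`, and `eff_opt` land close to the quoted values of about 0.2 for p(r, 0), about 1.78 for the threshold, and roughly 64 per cent for the efficiency; small differences in the last digit can be expected from rounding in the original computation.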

Figs. 7 and 8, p. 130, are curves of the information gain
function at the threshold parameters for signal-to-noise
ratios η_1 = 0.5 and 1. Fig. 9, p. 130, and Fig. 10 are curves
of the ASN function for η_1 = 0.5, 1 and α = β = 0.01. Fig.
11 gives curves of the OC function for η_1 = 0.5, 0.75, 1
and α = β = 0.01.


A Theory of Weighted Smoothing*


LOUIS A. ULE†


Summary—The problem attacked in this paper is that of a system
with a stationary random error input, a nonrandom signal input and
a nonrandom error input. The output of the system is required to
have a minimum weighted response to the random error and
otherwise to consist only of arbitrary functions of time linearly related to
the nonrandom signal input. The more general case, which includes
a random signal as well, is not considered.


INTRODUCTION 


IT HAS LONG been known that it is possible to give
different values of weight, depending on the source,
time, medium, and other circumstances, to pieces
of evidence from which a particular conclusion is to be
drawn, independently of the reasoning process used to
arrive at the conclusion. In data smoothing the evidence
to be so weighted is the input signal to the system, and
the conclusion to be drawn, or output signal, is some
function of the smoothed data. In the weighted smoothing
considered in this paper, the input data will be weighted
temporally only. Typically, though by no means
necessarily, recent data will be given more weight than old data.

* Manuscript received by the PGIT, November 23, 1956.
† Potter Pacific Corp., Malibu, Calif.

The concept of weighted smoothing is not new.¹ Filters,
for example, which consider inputs occurring within only
the past T seconds,² may be said to give such data an
even weight and to any older inputs zero weight. This
data weighting is not to be confused with the weighting
function of a linear system; the latter determines the type
of conclusion to be drawn from the input although its
exact form will depend on the way the input is to be
weighted.

¹ L. A. Ule, “Weighted least-squares smoothing filters,” IRE
Trans., vol. CT-2, pp. 197-203; June, 1955.

² L. A. Zadeh and J. R. Ragazzini, “An extension of Wiener's
theory of prediction,” J. Appl. Phys., vol. 21, pp. 645-655; July,
1950.

Since the various theories regarding smoothing are
optimum solutions, they can differ only in the nature of
the problem attacked. These problems differ in the nature
of what is known a priori about the input, the way
this input is to be weighted temporally and the type of
output that is desired. The Wiener³ filter, for example,
assumes that the input, both signal and error, is a
stationary time series. It assigns all past inputs uniform weight
and future inputs zero weight. The output is required to
differ from the signal part of the input with a least average
squared error. The filters of Zadeh and Ragazzini⁴ extend
the input to include power series in time of finite degree
n, give inputs up to T seconds old even weight and all
other inputs zero weight. The output is required to differ
least from any linear operation on the input.


FORMULATION 


In this paper, the input will be weighted nonuniformly,
and within limits, in any desired manner. This weighting
will be called the memory of the filter to distinguish it
from the system weighting function which performs other
operations in addition to weighting. The input will be
assumed to consist of the sum of three kinds of components:

1) A linear combination of r known functions of time,
described either by mathematical equations or by graphs,
and corresponding to the nonrandom components of the
signal.

2) A linear combination of n − r known functions of
time, which with the r functions above are linearly
independent within the memory of the filter. These
functions correspond to the nonrandom part of the error
input.

3) A stationary known error spectrum corresponding
to the random part of the error input. A stationary random
signal spectrum is not assumed since this case is covered
by the Wiener filter with the addition of suitable
constraints.

The output of the filter will consist of some response to
the random error input and any r functions of time either
separately or in summation proportional in amplitude to
the r signal components of the input. These functions, r
in number, may be the same as the signal input, they
may be predictions of the signal input components not
necessarily all by the same amount, they may be linear
operations on these inputs not necessarily all the same
or they may bear no functional relation to the input
components other than that their amplitudes are each
proportional to one of the components of the input. The
special case of zero output in the presence of a particular
input is used to reject the undesired n − r input
components.

³ N. Wiener, “The Extrapolation, Interpolation, and Smoothing
of Stationary Time Series,” John Wiley and Sons, Inc., New York,
N. Y.; 1949.

⁴ Zadeh and Ragazzini, loc. cit.

The desired signal inputs we shall call f_i(t), i = 1, 2,
··· r. The undesired components will be similarly
designated but with i = r + 1, r + 2, ··· n.

The random error input will be represented by its
autocorrelation function R(τ). This function, by the
Wiener-Khintchine relationship, is the Fourier cosine
transform of the power spectrum.

The desired output will be represented by the sum of
functions p_i(t), i = 1, 2, ··· n, where p_i(t) = 0 for
i = r + 1, r + 2, ··· n. Some of the forms that these
functions can assume are:

Simple estimation:     p_i(t) = f_i(t)
Prediction:            p_i(t) = f_i(t + α)
Differentiation:       p_i(t) = df_i(t)/dt
Attenuation or gain:   p_i(t) = k f_i(t)
Arbitrary:             For an input a_i f_i(t)
                       the output is a_i p_i(t).

The amplitude of each output component is proportional
to the amplitude of the associated input component. As
will be seen later, these components can be obtained
separately or in summation.

In producing the above outputs, there is no question of 
error, since we will require that they be produced exactly 
in the presence of the associated inputs and in the absence 
of noise. The only error that remains is the response to 
the random noise input. Even this latter can be made as 
small as desired by the error weighting function to be 
defined, but the resulting filter will have a correspondingly 
narrow bandwidth and long settling time. If a signal 
spectrum were involved, as in the Wiener filter, this 
would not be the case, since a filter with too narrow a 
bandwidth would produce increased error due to the 
attenuation of the random part of the signal. 

As indicated, the error output of the filter is to be
weighted temporally by a memory function prior to
minimization. We represent this function by H(τ), where
τ, measured from the present, represents time past. For
example, τ would be 10 for a time 10 seconds ago, zero
for the present, and minus 10 for a time 10 seconds from
now. Intuitively, we will require H(τ) to be large for
those times relative to the present when the signal input
is the most significant, small when the signal input is
likely to be obsolete or erroneous and zero when the
signal input is known to be irrelevant. For a realizable
filter H(τ) is zero for all future inputs; i.e., H(τ) = 0 for
τ < 0. The case where the memory function is
time-varying will not be discussed here since it is a rather
uncomplicated generalization.

The filter weighting function $W(\tau, t)$, as distinct from
the memory function, will be a function of time $t$, since
the generality of the signal inputs and outputs assumed
here will in some cases require time-varying filters. For
such a filter, the output $g(t)$ for an input $f(t)$ is obtained
from the filter weighting function by means of the
convolution integral; i.e.,

$$g(t) = \int_0^\infty W(\tau, t)\, f(t - \tau)\, d\tau. \tag{1}$$

Since the filter will be constrained to produce its outputs
without error in the presence only of its proper inputs
$f_i(t)$, $i = 1, 2, \ldots, n$, the only error that will otherwise
result will be due to the random noise input. The ensemble
average square of this error $\sigma^2(t)$ is as follows:

$$\sigma^2(t) = \int_0^\infty\!\!\int_0^\infty W(x, t)\, W(y, t)\, R(x - y)\, dx\, dy, \tag{2}$$

where $R(\tau)$ is the autocorrelation function describing the
random noise input. It is possible to make this error
arbitrarily small, yet meet all other requirements, by a
filter with a very long memory, or, what amounts to the
same thing, by a filter with a very narrow bandwidth. A
method often used to increase the bandwidth and reduce
the settling time of the filter is to constrain the weighting
function to be zero for $\tau > T$, where $T$ is a finite number.
Such a filter will have a settling time no greater than $T$
seconds but will require delay lines in its construction.
The method proposed here to obviate this difficulty is to
weight the error by the memory function $H(\tau)$ to produce
a weighted error $\sigma_H^2(t)$ as follows:

$$\sigma_H^2(t) = \int_0^\infty\!\!\int_0^\infty \frac{W(x, t)\, W(y, t)}{H(x)\, H(y)}\, R(x - y)\, dx\, dy. \tag{3}$$

Since a subsequent minimization is implied, it is clear
that if the weighted error is to be finite, let alone a
minimum, $W(\tau, t)$ must be zero when $H(\tau)$ is zero. This
intuitive requirement is the reason for putting the memory
function into the denominator. By setting

$$H(\tau) = U(\tau) - U(\tau - T), \tag{4}$$

where $U(\tau)$ is the unit step function, we will, for example,
limit the filter's memory to a duration of $T$ seconds.


SOLUTION


Before proceeding to minimize the weighted error by
the calculus of variations we meet the other requirements
previously set forth by imposing the following set of
constraints:

$$\int_0^\infty W(\tau, t)\, f_i(t - \tau)\, d\tau = p_i(t), \qquad i = 1, 2, \ldots, n. \tag{5}$$

An infinite number of filters meet these constraints,
and of this number we select the one for which the weighted
error (3) is least. Applying the calculus of variations we
obtain the following integral equation

$$\int_0^\infty \frac{W(x, t)}{H(x)\, H(\tau)}\, R(x - \tau)\, dx = \sum_{i=1}^{n} \lambda_i(t)\, f_i(t - \tau), \tag{6}$$

which together with constraints (5) determines the
weighting function $W(\tau, t)$ of the optimum filter. The
functions $\lambda_i(t)$ are the $n$ Lagrangian multipliers which
are eliminated from the $n + 1$ relations (5) and (6).


SOLUTION FOR WHITE NOISE


An explicit solution for white noise is obtained quite
readily since in this case $R(\tau) = \delta(\tau)$, where $\delta(\tau)$ is the
unit impulse. In this case (6) reduces to

$$W(\tau, t) = H^2(\tau) \sum_{i=1}^{n} \lambda_i(t)\, f_i(t - \tau), \tag{7}$$

substitution of which into (5) results in

$$\sum_{i=1}^{n} \lambda_i(t) \int_0^\infty H^2(\tau)\, f_i(t - \tau)\, f_j(t - \tau)\, d\tau = p_j(t), \qquad j = 1, 2, \ldots, n, \tag{8}$$

a set of simultaneous equations in the unknown $\lambda_i(t)$.
By defining

$$\sum_{j=1}^{n} c_{ij}(t) \int_0^\infty H^2(\tau)\, f_j(t - \tau)\, f_k(t - \tau)\, d\tau = \delta_{ik}, \tag{9}$$

where $\delta_{ik}$ is the Kronecker delta, the numbers $c_{ij}(t)$
become the elements of the inverse matrix corresponding
to the solution, so that

$$\lambda_i(t) = \sum_{j=1}^{n} c_{ij}(t)\, p_j(t), \qquad i = 1, 2, \ldots, n. \tag{10}$$

Examination of (9) and (10) shows that the only
condition that need be imposed on $H(\tau)$ is that the
matrix defined by the integral in (9) be nonsingular so
that a solution for the $\lambda_i(t)$ exists. Substitution of (10)
into (7) produces an explicit expression for the filter
weighting function; i.e.,

$$W(\tau, t) = H^2(\tau) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}(t)\, f_i(t - \tau)\, p_j(t). \tag{11}$$

This filter is time-varying. An exception occurs if the $n$
functions $f_i(t)$ are $n$ linearly independent solutions to an
$n$th order linear homogeneous differential equation with
constant coefficients and the functions $p_i(t)$ can be
obtained by time-invariant operations on the $n$ functions
$f_i(t)$.
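
The white-noise solution lends itself to a short numerical sketch. The block below is only an illustration under stated assumptions: the helper name `white_noise_weighting`, the trapezoidal quadrature, the ramp signal set $f_1 = 1$, $f_2 = t$, the memory function $H(\tau) = e^{-0.5\tau}$ and the evaluation time $t = 3$ are all hypothetical choices, not taken from the paper. It forms the matrix whose inverse appears in (9), solves for the multipliers of (10), builds $W(\tau, t)$ as in (7) and (11), and then evaluates the left side of the constraints (5).

```python
import numpy as np

def white_noise_weighting(f_list, p_list, H, t, tau):
    """Evaluate W(tau, t) of Eq. (11) on a uniform tau grid (illustrative helper)."""
    dtau = tau[1] - tau[0]
    # Trapezoidal quadrature weights on the tau grid.
    w = np.full(tau.size, dtau)
    w[0] = w[-1] = 0.5 * dtau
    H2 = H(tau) ** 2
    F = np.array([f(t - tau) for f in f_list])   # rows are f_i(t - tau)
    B = (F * (H2 * w)) @ F.T                     # matrix whose inverse is c_ij in Eq. (9)
    p = np.array([pj(t) for pj in p_list])       # desired outputs p_j(t)
    lam = np.linalg.solve(B, p)                  # Lagrangian multipliers, Eq. (10)
    W = H2 * (lam @ F)                           # weighting function, Eqs. (7) and (11)
    return W, w, F

# Hypothetical case: signal a1 + a2*t, simple estimation of its present value.
tau = np.linspace(0.0, 60.0, 6001)
f_list = [lambda x: np.ones_like(x), lambda x: x]
p_list = [lambda t: 1.0, lambda t: t]
W, w, F = white_noise_weighting(f_list, p_list, lambda x: np.exp(-0.5 * x), 3.0, tau)

# Left side of the constraints (5), one entry per signal function f_i.
constraint = F @ (W * w)
```

With the same quadrature used for the matrix and for the constraint check, `constraint` reproduces $p_1(3) = 1$ and $p_2(3) = 3$ to rounding accuracy, which is the sense in which the outputs are produced "without error" in the absence of noise.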
SOLUTION FOR ARBITRARY NOISE


The simplicity of the solution for white noise suggests
the following approach in other cases. The filter is divided
into two cascaded networks, the first of which is devised
to turn the noise spectrum at its input into white noise
at its output. The second network then has a white noise
input. Constraints (5) are imposed upon the combined
network and the second filter is chosen to make the
weighted error due to the random noise input a minimum
subject to these constraints. The mean square response
to the random noise for the combined network is now as
follows

$$\sigma^2(t) = \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty W_1(x)\, W_1(y)\, W_2(\tau, t)\, W_2(s, t)\, R(x - y + \tau - s)\, dx\, dy\, d\tau\, ds, \tag{12}$$

where $W_1(\tau)$ is the known weighting function of the first
network and $W_2(\tau, t)$ is the yet to be determined weighting
function of the second network. As before, we will minimize
the weighted error $\sigma_H^2(t)$, defined as follows,

$$\sigma_H^2(t) = \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty W_1(x)\, W_1(y)\, \frac{W_2(\tau, t)\, W_2(s, t)}{H(\tau)\, H(s)}\, R(x - y + \tau - s)\, dx\, dy\, d\tau\, ds, \tag{13}$$

except that now the constraints become

$$\int_0^\infty\!\!\int_0^\infty W_1(x)\, W_2(y, t)\, f_i(t - x - y)\, dx\, dy = p_i(t), \qquad i = 1, 2, \ldots, n. \tag{14}$$

Applying the calculus of variations again and noting
that the second network has a white noise input, the
weighting function of this network becomes

$$W_2(\tau, t) = H^2(\tau) \sum_{i=1}^{n} \lambda_i(t) \int_0^\infty W_1(x)\, f_i(t - \tau - x)\, dx. \tag{15}$$

Eliminating the Lagrangian multipliers, as before, we
obtain

$$W_2(\tau, t) = H^2(\tau) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}(t)\, p_j(t) \int_0^\infty W_1(x)\, f_i(t - \tau - x)\, dx, \tag{16}$$

where the numbers $c_{ij}(t)$ are the elements of the inverse
matrix of the set of equations whose matrix elements are

$$b_{ij}(t) = \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty H^2(z)\, W_1(x)\, W_1(y)\, f_i(t - x - z)\, f_j(t - y - z)\, dx\, dy\, dz. \tag{17}$$

As before, the combined filter is, in general, time-varying
and of considerable complexity. It is, however,
readily synthesized. Examination of the convolution
integral (5) shows that factors of the weighting function
which are functions of $t - \tau$ are synchronized with the
input signal. The input signal is, therefore, simply multiplied
by these factors before being acted upon by the
convolution integral. Factors which are functions of $\tau$
only are time invariant and generally denote filter networks
which can be synthesized through the Laplace
transform. Finally, the remaining factors which are
functions of $t$ only can be factored out of the integral;
they, therefore, multiply the output of the preceding
filter sections.

The above is not the only mechanization possible.
Functions of $t - \tau$ can often be factored into functions
of $t$ and $\tau$ only, or otherwise manipulated, so that when
the preceding procedure is applied a different configuration
results.

A second order mechanization is blocked out in Fig. 1.
It is interesting to note that, other than in input network
$W_1(\tau)$, the only smoothing network that appears consists
of $n$ identical filters whose weighting function is $H^2(\tau)$,
the filter memory function squared. After passing through
the input network $W_1(\tau)$, the input signal is multiplied
by $n$ multipliers, each multiplying by one of the functions
$f_i(t)$ as modified by the input network. This matrix of $n$
multipliers is then properly called an autocorrelation
matrix. The $n$ multiplied signals are individually smoothed
and then transformed to $n$ constant voltages by the matrix
of multipliers $c_{ij}(t)$. These $n$ constant voltages are the
coefficients $a_i$ of the input $f(t)$; viz.,

$$f(t) = \sum_{i=1}^{n} a_i f_i(t). \tag{18}$$

These dc voltages can be modulated into any output
functions $p_i(t)$ desired by output multipliers and these
outputs can be obtained separately, in summation, or
only in part. As pointed out previously, there are occasions
when no time-varying multiplications are required.

Fig. 1. Second order mechanization. [Block diagram: the input,
noise plus $\sum a_i f_i(t)$, passes through the $W_1(\tau)$ filter, is
multiplied by the signal functions as modified by $W_1$, smoothed by
identical $H^2(\tau)$ filters, multiplied by the $c_{ij}(t)$, and multiplied
by the $p_i(t)$ to give the output, noise plus $\sum a_i p_i(t)$.]


EXAMPLE 


The example to be given will be that of a phase detector
which will have a dc output equal to the peak value of the
sine component of a random phase sine wave signal of
one radian frequency, and no output to the cosine component.
Of course, if desired, a dc output equal to the
peak value of the cosine component can be obtained with
no additional mathematical labor since the matrix inversion
to be performed is the same regardless of the
desired output. Specifications for this detector are as
follows:


Nonrandom input: $a_1 \sin t + a_2 \cos t$
Noise spectral density: $(\omega^2 + 1)/(\omega^2 + 4)$
Desired output: $p_1(t) = a_1$, $p_2(t) = 0$
Filter memory function: $H(\tau) = e^{-0.05\tau}$.


The weighting function $W_1(\tau)$ which will make the
noise input into the second network white is quite obviously
the inverse Laplace transform of

$$\frac{s + 2}{s + 1},$$

so that

$$W_1(\tau) = \delta(\tau) + e^{-\tau}.$$

Using (17) we next calculate the functions $b_{ij}(t)$.
These are:

$$b_{11}(t) = 12.5 + \frac{1}{4.01}\,(1.4 \cos 2t - 2.075 \sin 2t)$$
$$b_{12}(t) = b_{21}(t) = -\frac{1}{4.01}\,(1.4 \sin 2t + 2.075 \cos 2t)$$
$$b_{22}(t) = 12.5 - \frac{1}{4.01}\,(1.4 \cos 2t - 2.075 \sin 2t).$$

The elements of the inverse matrix without approximation
are:

$$c_{11}(t) = 0.0802 - 0.0016\,(1.4 \cos 2t - 2.075 \sin 2t)$$
$$c_{12}(t) = c_{21}(t) = 0.0016\,(1.4 \sin 2t + 2.075 \cos 2t)$$
$$c_{22}(t) = 0.0802 + 0.0016\,(1.4 \cos 2t - 2.075 \sin 2t).$$

The integrals of (16) are next evaluated for $\sin t$ and
$\cos t$. These are, respectively:

$$1.5 \sin (t - \tau) - 0.5 \cos (t - \tau)$$

and

$$1.5 \cos (t - \tau) + 0.5 \sin (t - \tau).$$

Substitution of the foregoing expressions into (16)
results in

$$W_2(\tau, t) = e^{-0.1\tau} \{ [0.0802 - 0.0016\,(1.4 \cos 2t - 2.075 \sin 2t)] \times [1.5 \sin (t - \tau) - 0.5 \cos (t - \tau)]$$
$$\qquad + 0.0016\,[1.4 \sin 2t + 2.075 \cos 2t] \times [1.5 \cos (t - \tau) + 0.5 \sin (t - \tau)] \}.$$

Fig. 2. Phase detector. [Block diagram: the input, noise plus
$a_1 \sin t + a_2 \cos t$, is multiplied by $1.5 \sin t - 0.5 \cos t$ and by
$0.5 \sin t + 1.5 \cos t$ ahead of the smoothing filters and the
output multipliers.]


The complete phase detector is shown mechanized in
Fig. 2. The mechanization shows two dc outputs equal
to the peak of the sine and cosine components of the input
sine wave. To meet the requirement of just one output,
two multipliers and an adder are eliminated from the
circuit.
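
The numbers quoted in the example can be checked numerically. The following is a minimal sketch under stated assumptions: it takes the memory function as $H(\tau) = e^{-0.05\tau}$ (so $H^2 = e^{-0.1\tau}$), takes the whitened signal functions as $1.5 \sin t - 0.5 \cos t$ and $1.5 \cos t + 0.5 \sin t$, normalizes the desired output to $p_1 = 1$, $p_2 = 0$, and evaluates everything at the arbitrary time $t = 0.7$. The names `integrate`, `b_matrix` and `c_quoted` are hypothetical helpers, and the upper limit of integration is truncated at 400 seconds.

```python
import numpy as np

def integrate(y, dz):
    """Trapezoidal rule on a uniform grid."""
    return dz * (np.sum(y) - 0.5 * (y[0] + y[-1]))

dz = 0.002
z = np.arange(0.0, 400.0 + dz, dz)   # exp(-0.1 z) is negligible beyond z = 400
H2 = np.exp(-0.1 * z)                # H^2 for H = exp(-0.05 tau)

def g1(x):
    """sin t after the whitening network W_1."""
    return 1.5 * np.sin(x) - 0.5 * np.cos(x)

def g2(x):
    """cos t after the whitening network W_1."""
    return 1.5 * np.cos(x) + 0.5 * np.sin(x)

def b_matrix(t):
    """Matrix elements of Eq. (17) for the example, by numerical integration."""
    G = np.array([g1(t - z), g2(t - z)])
    return np.array([[integrate(H2 * G[i] * G[j], dz) for j in range(2)]
                     for i in range(2)])

def c_quoted(t):
    """Inverse-matrix elements as quoted in the text."""
    u = 1.4 * np.cos(2 * t) - 2.075 * np.sin(2 * t)
    v = 1.4 * np.sin(2 * t) + 2.075 * np.cos(2 * t)
    return np.array([[0.0802 - 0.0016 * u, 0.0016 * v],
                     [0.0016 * v, 0.0802 + 0.0016 * u]])

t = 0.7
C = np.linalg.inv(b_matrix(t))

# Second-network weighting function, Eq. (16), with p_1 = 1 and p_2 = 0.
W2 = H2 * (C[0, 0] * g1(t - z) + C[0, 1] * g2(t - z))
sine_gain = integrate(W2 * g1(t - z), dz)     # response to a unit sine component
cosine_gain = integrate(W2 * g2(t - z), dz)   # response to a unit cosine component
```

Under these assumptions the numerically inverted matrix agrees with the quoted coefficients to the printed precision, `sine_gain` comes out as unity, and `cosine_gain` vanishes, which is the behavior required of the detector.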


CONCLUSION


A contribution of this paper lies in the extension of the 
class of nonrandom functions which can be used to 
represent a signal. These functions may be defined by a 
mathematical expression or by graphs and need only be 
finite and piecewise continuous. A practical aspect is the
possible specification of the memory function $H(\tau)$ so
as to obtain lumped constant filters as against the conventional
specification that $H(\tau) = U(\tau) - U(\tau - T)$,
which requires a filter with distributed constants. Finally,
a way has been devised to apply the weighted error 
concept to filters having an arbitrary stationary noise 
input. The generalization could have been carried further 
to include nonstationary time series and a time-varying 
memory function. At this stage of the art it is sufficient 
to propose the problem; its solution is a foregone con- 
clusion. 
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The Response of a Phase-Locked Loop to a Sinusoid 
Plus Noise*


STEPHEN G. MARGOLIS†


Summary—The phase-locked loop is a practical device for separat- 
ing a sinusoidal signal from additive noise. In this device the in- 
coming signal-plus-noise is multiplied by a noise-free sinusoid 
generated by a voltage-controlled oscillator (vco). The filtered 
product is used to lock the phase of the vco output to that of the in- 
coming signal, thus producing a relatively clean version of the in- 
coming signal in which the noise manifests itself as a small phase 
modulation. Analysis of this noise-produced phase modulation is 
complicated by the presence of the multiplier at the input to the 
loop. This paper presents a perturbation method which reduces this 
inherently nonlinear servo analysis problem to the analysis of a 
series of linear systems, the first of which is related to the linear 
model used by previous authors. The perturbation technique per- 
mits the phase modulation resulting from an arbitrary noise input 
to be computed to any desired accuracy. This analysis is particularly 
useful in predicting loop performance when it is used as a narrow- 
band receiver in a phase-comparison angle-measuring system. 


LIST OF SYMBOLS


The following is a list of the symbols used in this paper:

$A$ = peak amplitude of sinusoidal input, volts.

$A_r$ = peak amplitude of sinusoidal input to reference
channel, volts.

$A_m(\theta_s)$ = peak amplitude of sinusoidal input to measurement
channel, volts.

$C$ = capacitance used in compensating network, farads.

$f(t)$ = random time-function, dimensionless.

$g(t)$ = random time function defined in (30c), dimensionless.

$H(S)$ = transfer function of phase-locked loop, relating
small transient changes in input phase angle to the
resulting changes in output phase angle.

$h(\tau)$ = inverse L-transform of $H(S)$.

$J = 2N/A$ = noise-to-signal ratio, dimensionless.

$K = \dfrac{GA}{2} \cdot \dfrac{1\ \text{radian}}{\text{volt-second}}$ = loop gain constant, radians/second.

$N$ = noise amplitude factor, volts.

$N_r$ = noise amplitude factor for reference channel, volts.

$N_m(\theta_n)$ = noise amplitude factor for measurement channel,
volts.

$R_1$, $R_2$ = resistances used in compensating network, ohms.

$S$ = complex frequency, radians/second.


* Manuscript received by the PGIT, November 23, 1956. This 
paper was presented at IRE-WESCON, Los Angeles, Calif.; 
August 21-24, 1956. Results are presented of one phase of research 
carried out at Jet Propulsion Lab., Calif. Inst. Tech., under contract 
no. DA-04-495-Ord 18, sponsored by the Dept. of the Army, Ord- 
nance Corps. 

† Westinghouse Electric Corp., Pittsburgh, Pa. Formerly with
Jet Propulsion Lab., Calif. Inst. Tech., Pasadena, Calif.


$t$ = time, seconds.

$T$ = averaging interval, seconds.

$T_1 = (R_1 + R_2)C$ = larger time-constant of compensating
network, seconds.

$T_2 = R_2 C$ = smaller time-constant of compensating network,
seconds.

$v_0(t)$ = output of voltage-controlled oscillator, volts.

$v_1(t)$ = input to phase-locked loop, volts.

$v_2(t)$ = output of multiplier, volts.

$v_4(t)$ = input to control terminal of voltage-controlled
oscillator, volts.

$v_5(t)$ = output of angle-measuring system, volts.

$v_m(t)$ = input to measurement channel from antenna, volts.

$v_r(t)$ = input to measurement channel from reference
channel, volts.

$W$ = noise spectral density, (volts)$^2$-second/radian.

$\Delta\omega$ = half-bandwidth of narrow-band noise, radians/second.

$\Delta\omega' = \Delta\omega/K$ = normalized half-bandwidth of narrow-band
noise, dimensionless.

$\theta_n$ = angle between reference line and line joining noise-source
location and antenna location, radians.

$\theta_s$ = angle between reference line and line joining signal-source
location and antenna location, radians.

$\lambda(t) = \lambda_{vco} - \lambda_i$ = noise-produced phase error, radians.

$\lambda_1(t), \lambda_2(t), \ldots, \lambda_n(t)$ = components of $\lambda(t)$, radians.

$\lambda_i(t)$ = phase angle of input sinusoid, radians.

$\Lambda_i(S)$ = L-transform of $\lambda_i(t)$.

$\lambda_{vco}(t)$ = phase angle of voltage-controlled oscillator output,
radians.

$\Lambda_{vco}(S)$ = L-transform of $\lambda_{vco}(t)$.

$\sigma_r$ = rms noise level at input to reference channel, volts.

$\sigma_f$ = rms value of $f(t)$, dimensionless.

$\sigma_m$ = rms noise level at input to measurement channel,
volts.

$\tau$ = dummy variable used in convolution integral, seconds.

$\phi_{ff}(\tau)$ = autocorrelation function of $f(t)$.

$\omega_0 = \omega_c - \omega_1$ = difference between center-frequency of
noise band and angular frequency of input sinusoid,
radians/second.

$\omega_0' = \omega_0/K$ = normalized difference frequency, dimensionless.

$\omega_1$ = angular frequency of input sinusoid; center-frequency
of voltage-controlled oscillator, radians/second.

$\omega_c$ = center-frequency of narrow noise band, radians/second.

$\omega_N$ = closed-loop bandwidth of compensated phase-locked
loop, radians/second.



INTRODUCTION


The basic function of a phase-locked loop is to lock
the phase of the output of a voltage-controlled oscillator
(vco)¹ to the phase of an incoming sinusoidal signal. A
properly designed loop is capable of performing this function
even in the presence of noise power which greatly
exceeds the signal power.² The noise manifests itself as a
random phase modulation of the vco output. This random
phase modulation is usually treated by deriving an equivalent
transfer function for the loop, by which transient
changes in input phase can be related to the resulting
changes in output phase; the incoming signal-plus-noise is
usually approximated by a signal phase-modulated by
noise, and the output phase modulation is then found by
passing the input phase modulation through the equivalent
transfer function. This paper takes a somewhat more direct
approach to the problem, and thus avoids the necessity of
converting a signal-plus-noise to a signal phase-modulated
by noise; consequently it avoids defining an equivalent phase
angle when the noise power greatly exceeds the signal
power. The results are easily interpreted in terms of the
noise-free behavior of the loop, which will be reviewed first.


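Fig. 1. Basic structure of a phase-locked loop. [Block diagram:
the input $v_1 = A \sin(\omega_1 t + \lambda_i) + N f(t)$ and the vco output
$v_0 = \cos(\omega_1 t + \lambda_{vco})$ drive a multiplier with output $v_2$;
an amplifier of gain $G$ feeds the voltage-controlled oscillator.]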


Fig. 1 shows the basic structure of a phase-locked loop
with a noisy input. Assuming for the moment that
$N = 0$ (i.e., the input signal is free from noise) and
neglecting terms with frequency $2\omega_1$, the output of the
amplifier is

$$v_4 = \frac{GA}{2} \sin(\lambda_i - \lambda_{vco}) \approx \frac{GA}{2}\,(\lambda_i - \lambda_{vco}) \quad \text{for } (\lambda_i - \lambda_{vco}) \to 0. \tag{1}$$

It is assumed that the sensitivity of the vco is 1 radian
per second per volt,³ so that

$$\dot{\lambda}_{vco} = \frac{1\ \text{radian}}{\text{volt-second}}\; v_4. \tag{2}$$

Combining (1) and (2), the equation relating the vco
phase to the input phase is⁴

$$\dot{\lambda}_{vco} + K \lambda_{vco} = K \lambda_i, \tag{3}$$

where

$$K = \frac{GA}{2} \cdot \frac{1\ \text{radian}}{\text{volt-second}}.$$

The steady-state solution is $\lambda_{vco} = \lambda_i$. The response of
this simple loop to small changes in the input phase $\lambda_i$
can be found by L-transforming (3). The result is

$$H(S) = \frac{\Lambda_{vco}(S)}{\Lambda_i(S)} = \frac{K}{S + K}, \tag{4}$$

which defines the equivalent transfer function for the
loop. Here $\Lambda_{vco}$ and $\Lambda_i$ are the transforms of $\lambda_{vco}$ and $\lambda_i$,
respectively.

¹ In this paper, the term voltage-controlled oscillator refers to
an oscillator in which the deviation of the output frequency from
its nominal value is proportional to a control voltage.

² R. Jaffe and E. Rechtin, "Design and performance of phase-locked
circuits capable of near optimum performance over a wide
range of input signal and noise levels," IRE Trans., vol. IT-1,
pp. 66-76; March, 1955.

³ The actual numerical value of the vco sensitivity can be
absorbed into the gain $G$.

⁴ By definition, $K$ has the units radians/second.

In the simple loop the constant $K$ determines both the
bandwidth of the transfer function and the range of
frequencies over which the loop will lock (pull-in range).⁵
These parameters may be controlled separately by including
an RC compensating network, connected between
the multiplier output and the vco input, as shown in Fig.
2. For the compensated loop,⁵,⁶

$$H(S) = \frac{\dfrac{K}{T_1}\,(1 + T_2 S)}{S^2 + \dfrac{1 + K T_2}{T_1}\, S + \dfrac{K}{T_1}}, \tag{5}$$

where

$$T_1 = (R_1 + R_2)\,C, \qquad T_2 = R_2 C,$$

and

$$K = \frac{GA}{2} \cdot \frac{1\ \text{radian}}{\text{volt-second}},$$

as before. Here again, the steady-state solution is $\lambda_{vco} = \lambda_i$
and the pull-in range is $\pm K$ radians per second, but the
bandwidth of the transfer function $\omega_N$ is equal to $\sqrt{K/T_1}$.

Fig. 2. Phase-locked loop with RC compensation. [Block diagram:
as Fig. 1, with a loop filter formed by the RC compensating network
between the multiplier output and the vco input.]

For properly damped response to changes in $\lambda_i$ it is usual
to choose


⁵ W. J. Gruen, "Theory of afc synchronization," Proc. IRE,
vol. 41, pp. 1043-1048; September, 1953. See pp. 1045-1046.

⁶ P. F. Ordung, J. E. Gibson, and B. J. Shinn, "Closed Loop
Automatic Phase Control," presented at AIEE Summer and
Pacific General Meeting, Los Angeles, Calif.; June 21-25, 1954.
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and to make 


geil 
Ih dhe 
THE ANALYSIS OF NOISE IN SIMPLE LOOPS


Referring again to Fig. 1, the input to the simple loop is

$$v_1 = A \sin(\omega_1 t + \lambda_i) + N f(t). \tag{6}$$

For convenience in the subsequent analysis, the noise has
been written as the product of a constant $N$ and a time-varying
part $f(t)$. The vco output is

$$v_0 = \cos(\omega_1 t + \lambda_{vco}). \tag{7}$$

The multiplier forms the product $v_0 v_1$, and this is

$$v_2 = \frac{A}{2} \left[ \frac{2N}{A}\, f(t) \cos(\omega_1 t + \lambda_{vco}) + \sin(\lambda_i - \lambda_{vco}) \right]. \tag{8}$$

Here again, periodic terms with frequency $2\omega_1$ have been
ignored. In practice, the vco does not respond to these
terms when they appear at its input. The amplifier output
is

$$v_4 = G v_2 \tag{9}$$

and because $v_4$ controls the frequency of the voltage-controlled
oscillator,

$$\dot{\lambda}_{vco} = \frac{1\ \text{radian}}{\text{volt-second}}\; v_4. \tag{10}$$

Eqs. (8)-(10) combine to give

$$\dot{\lambda} + K \sin \lambda = K J f(t) \cos(\omega_1 t + \lambda_i + \lambda), \tag{11}$$

where $K = \tfrac{1}{2} G A$, $J = 2N/A$, and $\lambda$ has been written for
$\lambda_{vco} - \lambda_i$. Eq. (11) describes the behavior of the simple
loop without any approximations other than those
implicit in the use of an ideal multiplier to represent a
physical component.

In cases of practical interest the phase modulation $\lambda$ is
small, certainly less than $\pm 90°$. In addition, if the
statistical properties of the time function $f(t)$ are fixed,
the amplitude of $\lambda$ must increase if $N$, which may be
regarded as the amplitude of the noise, is increased. Thus
if $\lambda$ is written as a power series⁷

$$\lambda(t) = \lambda_0(t) + J \lambda_1(t) + J^2 \lambda_2(t) + J^3 \lambda_3(t) + \cdots \tag{12}$$

the solution of (11) can be reduced to solving for the
coefficients $\lambda_0(t), \lambda_1(t), \lambda_2(t), \lambda_3(t), \ldots$, each of which is
a random time function. This is easily done by substituting
the series (12) into (11) and replacing $\sin \lambda$ by
$(\lambda - \lambda^3/6 + \cdots)$ and $\cos \lambda$ by $(1 - \lambda^2/2 + \cdots)$ whenever
they appear and then equating the coefficients of equal
powers of $J$. The result is the series of linear equations,

$$\lambda_0 = 0 \tag{13}$$
$$\dot{\lambda}_1 + K \lambda_1 = K f(t) \cos(\omega_1 t + \lambda_i) \tag{13a}$$
$$\dot{\lambda}_2 + K \lambda_2 = K \left[ -\lambda_1 f(t) \sin(\omega_1 t + \lambda_i) \right] \tag{13b}$$
$$\dot{\lambda}_3 + K \lambda_3 = K \left[ \frac{\lambda_1^3}{6} - \frac{\lambda_1^2}{2}\, f(t) \cos(\omega_1 t + \lambda_i) - \lambda_2 f(t) \sin(\omega_1 t + \lambda_i) \right] \tag{13c}$$
$$\dot{\lambda}_n + K \lambda_n = K \cdot \text{function of } (\lambda_1, \lambda_2, \ldots, \lambda_{n-1}).$$

⁷ The use of this series expansion was suggested by L. R. Welch
of the Jet Propulsion Lab., Pasadena, Calif.


These equations have a simple interpretation. Eq. (13a)
together with the series (12) implies that a first approximation
to the effect of noise on the loop can be found by
analyzing the system shown schematically in Fig. 3(a). The
noise $f(t)$ is multiplied by $\cos(\omega_1 t + \lambda_i)$ and the resulting
low-frequency noise, after passing through a low-pass
filter with transfer function $K/(S + K)$ and an attenuator
with a transmission equal to $J$, gives the phase modulation
$\lambda(t)$.

Fig. 3(b) is a schematic representation which includes
the effect of the first three terms in the series (12). An
even more detailed physical model can be constructed by
including four terms in the series (12) as shown in Fig.
3(c). Despite the seeming complexity of these models,
the computation is straightforward in that there is no
feedback; all signals flow from left to right. In addition,
each equation contains only one unknown.
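
The feed-forward character of the perturbation equations can be illustrated with a short simulation. This is only a sketch under assumptions that are not in the paper: the random function $f(t)$ is replaced by a deterministic two-tone waveform, the values $K = 1$, $J = 0.1$, $\omega_1 = 20$ and $\lambda_i = 0.3$ are arbitrary, all initial conditions are zero, and a fixed-step fourth-order Runge-Kutta integrator (`rk4_step`, a hypothetical helper) is used. The nonlinear equation (11) is integrated alongside (13a) and (13b), and the full solution is compared with the one-term and two-term truncations of the series (12).

```python
import math

K, J = 1.0, 0.1           # loop gain and noise-to-signal ratio (assumed values)
W1, LAM_I = 20.0, 0.3     # input frequency and input phase (assumed values)

def f(t):
    """Deterministic stand-in for the noise waveform (an assumption)."""
    return math.cos(21.0 * t) + 0.5 * math.sin(18.5 * t)

def rhs(t, y):
    """Time derivatives of (lambda, lambda_1, lambda_2)."""
    lam, l1, l2 = y
    phi = W1 * t + LAM_I
    return (
        -K * math.sin(lam) + K * J * f(t) * math.cos(phi + lam),  # Eq. (11)
        -K * l1 + K * f(t) * math.cos(phi),                       # Eq. (13a)
        -K * l2 - K * l1 * f(t) * math.sin(phi),                  # Eq. (13b)
    )

def rk4_step(t, y, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)))
    k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)))
    k4 = rhs(t + h, tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6 * (p + 2 * q + 2 * r + s)
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))

def compare(t_end=20.0, h=1e-3):
    """Largest deviation of the series truncations from the full solution."""
    t, y = 0.0, (0.0, 0.0, 0.0)
    err1 = err2 = 0.0
    for _ in range(int(round(t_end / h))):
        y = rk4_step(t, y, h)
        t += h
        lam, l1, l2 = y
        err1 = max(err1, abs(lam - J * l1))               # first approximation
        err2 = max(err2, abs(lam - J * l1 - J * J * l2))  # second approximation
    return err1, err2

err1, err2 = compare()
```

Note that `lambda_1` and `lambda_2` are driven only by quantities already computed, so they could equally be obtained one after the other; they are integrated together here purely for convenience. For a small $J$ such as the one assumed, the two-term truncation should track the nonlinear solution more closely than the one-term truncation.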

The physical model shown in Fig. 3(b) is useful in finding
any steady-state phase error which may be produced
when the noise $N f(t)$ is multiplied by the phase-modulated
vco output. Any correlation between these waveforms
would be expected to produce a dc component in the
output of the multiplier which would eventually manifest
itself as a steady-state phase offset. The physical model
makes it clear that $\lambda_1$ contains no steady component, but
that $\lambda_2$ may contribute a steady-state phase error owing
to the multiplication of the time function $f(t) \sin(\omega_1 t + \lambda_i)$
by $\lambda_1$, which is just a filtered version of $f(t) \cos(\omega_1 t + \lambda_i)$.
Since the impulse response of a system with transfer
function $K/(S + K)$ is $K e^{-K\tau}$, the time function $\lambda_1$ is the
convolution of $K e^{-K\tau}$ and $f(t) \cos(\omega_1 t + \lambda_i)$, i.e.,

$$\lambda_1 = \int_0^\infty K e^{-K\tau} f(t - \tau) \cos[\omega_1 (t - \tau) + \lambda_i]\, d\tau \tag{14}$$

and the average value of $\lambda_2$ is just the average value of
$-\lambda_1 f(t) \sin(\omega_1 t + \lambda_i)$, or

$$\lambda_{2\,avg} = -\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \left\{ \int_0^\infty K e^{-K\tau} f(t - \tau) \cos[\omega_1 (t - \tau) + \lambda_i]\, d\tau \right\} f(t) \sin(\omega_1 t + \lambda_i)\, dt. \tag{15}$$

Since $f(t)$ is assumed to be a random time function with
no periodic components, (15) reduces to

$$\lambda_{2\,avg} = -\frac{K}{2} \int_0^\infty e^{-K\tau} \phi_{ff}(\tau) \sin \omega_1 \tau\, d\tau, \tag{16}$$

where $\phi_{ff}$ has its usual definition,

$$\phi_{ff}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, f(t - \tau)\, dt. \tag{17}$$

Fig. 3. (a) First approximation to effects of noise on a simple loop.
(b) Model for second approximation to effects of noise on a simple
loop. (c) Model for third approximation to effects of noise on a
simple loop.


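For the specific case in which $f(t)$ is white noise which
has been passed through a filter with an ideal rectangular
band-pass characteristic, as shown in Fig. 4,

$$\phi_{ff}(\tau) = 4W\, \frac{\sin \Delta\omega \tau}{\tau} \cos \omega_c \tau \tag{18}$$
$$\phantom{\phi_{ff}(\tau)} = \sigma_f^2\, \frac{\sin \Delta\omega \tau}{\Delta\omega \tau} \cos \omega_c \tau, \tag{18a}$$

where $W$ is the noise power per unit bandwidth. For this
type of noise,

$$\lambda_{2\,avg} = -\frac{K \sigma_f^2}{2 \Delta\omega} \int_0^\infty \frac{e^{-K\tau}}{\tau} \sin \Delta\omega \tau \cos \omega_c \tau \sin \omega_1 \tau\, d\tau. \tag{19}$$

Fig. 4. Generation of $f(t)$. [Block diagram: a wide-band noise source
feeds a bandpass filter with transfer ratio $H_1(S)$; $|H_1|^2$ is
rectangular between $\omega_c - \Delta\omega$ and $\omega_c + \Delta\omega$.]

The definite integral (19) can be found in integral tables;⁸
its value is

$$\lambda_{2\,avg} = -\frac{K \sigma_f^2}{16 \Delta\omega} \ln \frac{K^2 + (\omega_c + \omega_1 + \Delta\omega)^2}{K^2 + (\omega_c + \omega_1 - \Delta\omega)^2} + \frac{K \sigma_f^2}{16 \Delta\omega} \ln \frac{K^2 + (\omega_c - \omega_1 + \Delta\omega)^2}{K^2 + (\omega_c - \omega_1 - \Delta\omega)^2}. \tag{20}$$

If the noise is confined to a narrow band, $\omega_c + \omega_1 \gg \Delta\omega$.
In this case, the first term in (20) becomes small compared
to the second term; evaluation of the second term is
simplified by the introduction of the normalized variables

$$\omega_0' = \frac{\omega_c - \omega_1}{K}$$

and

$$\Delta\omega' = \frac{\Delta\omega}{K}.$$

For narrow-band noise

$$\lambda_{2\,avg} \approx \frac{\sigma_f^2}{16 \Delta\omega'} \ln \frac{1 + (\omega_0' + \Delta\omega')^2}{1 + (\omega_0' - \Delta\omega')^2}, \tag{21}$$

which is an odd function of $\omega_0'$. The resulting phase error is

$$J^2 \lambda_{2\,avg} \approx \frac{1}{4}\, \frac{N^2 \sigma_f^2}{A^2 \Delta\omega'} \ln \frac{1 + (\omega_0' + \Delta\omega')^2}{1 + (\omega_0' - \Delta\omega')^2}. \tag{22}$$

The ratio $N^2 \sigma_f^2 / A^2$ is [from (6) and (18)] just the square
of the ratio of rms noise to peak signal, measured at the
input to the loop. The phase bias is thus a function of:
$N^2 \sigma_f^2 / A^2$, the ratio of mean-square noise to the square of
signal amplitude; $\Delta\omega'$, the ratio of noise bandwidth (see
Fig. 4) to loop bandwidth $K$; and $\omega_0'$, the ratio of $\omega_0$ to
loop bandwidth. Here $\omega_0$ is the difference between the
signal frequency $\omega_1$ and the noise center frequency $\omega_c$.
Fig. 5 is a plot of $J^2 \lambda_2$ vs $\omega_0'$ for $N^2 \sigma_f^2 / A^2 = 1$ and $\Delta\omega' = 1$.
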

Fig. 5. Steady-state phase offset produced by narrow-band noise
at input of an uncompensated phase-locked loop, plotted against
$\omega_0' = (\omega_c - \omega_1)/K$ from $-10$ to $10$ for $\Delta\omega' = \Delta\omega/K = 1$ and
$N^2 \sigma_f^2 / A^2 = 1$.


The phase error vanishes when $\omega_0 = 0$; i.e., when the
center-frequency of the noise band coincides with $\omega_1$.
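
The narrow-band result is simple enough to tabulate directly. The sketch below evaluates the phase offset $\frac{1}{4}\,\frac{N^2\sigma_f^2}{A^2\,\Delta\omega'}\,\ln\frac{1 + (\omega_0' + \Delta\omega')^2}{1 + (\omega_0' - \Delta\omega')^2}$ of (22); the function name `phase_bias` and its argument names are hypothetical, and the parameter values are the ones quoted for Fig. 5.

```python
import math

def phase_bias(w0_norm, dw_norm, noise_to_signal_sq):
    """Steady-state phase offset J^2 * lambda_2avg of Eq. (22), in radians.

    w0_norm            -- omega_0' = (omega_c - omega_1) / K
    dw_norm            -- Delta omega' = Delta omega / K
    noise_to_signal_sq -- N^2 sigma_f^2 / A^2
    """
    ratio = (1.0 + (w0_norm + dw_norm) ** 2) / (1.0 + (w0_norm - dw_norm) ** 2)
    return 0.25 * noise_to_signal_sq / dw_norm * math.log(ratio)

# Points of the curve for the parameters quoted for Fig. 5.
curve = [(w0, phase_bias(w0, 1.0, 1.0)) for w0 in range(-10, 11)]
```

The tabulated values change sign with $\omega_0'$ and pass through zero at $\omega_0' = 0$, in line with the remarks above on the odd symmetry of (21) and the vanishing of the phase error when the noise band is centered on the signal.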
NOISE IN LOOPS WITH RC COMPENSATION


The analysis of practical phase-lock circuits, which
include an RC compensation network as shown in Fig. 2,
follows the same general plan as that used to analyze the
uncompensated loop. As before, ignoring components in
$v_2$ with frequency $2\omega_1$,

$$v_1 = A \sin(\omega_1 t + \lambda_i) + N f(t) \tag{23}$$
$$v_0 = \cos(\omega_1 t + \lambda_{vco}) \tag{24}$$
$$v_2 = \frac{A}{2} \left[ J f(t) \cos(\omega_1 t + \lambda_{vco}) - \sin(\lambda_{vco} - \lambda_i) \right] \tag{25}$$

and

$$\dot{\lambda}_{vco} = \frac{1\ \text{radian}}{\text{volt-second}}\; v_4. \tag{26}$$

Because of the inclusion of the RC network, however,

$$v_4 + T_1 \dot{v}_4 = G\,(v_2 + T_2 \dot{v}_2), \tag{27}$$

where

$$T_1 = (R_1 + R_2)\,C \quad \text{and} \quad T_2 = R_2 C. \tag{28}$$

Eq. (27) becomes

$$\ddot{\lambda} + \frac{1}{T_1} \dot{\lambda} = \frac{K}{T_1} \left\{ \left[ J f(t) \cos(\omega_1 t + \lambda_i + \lambda) - \sin \lambda \right] + T_2 \frac{d}{dt} \left[ J f(t) \cos(\omega_1 t + \lambda_i + \lambda) - \sin \lambda \right] \right\}. \tag{29}$$

Substitution of the power series (12) into (29) and collection
of the coefficients of equal powers of $J$ lead to a
series of equations which resembles the series (13):

$$\lambda_0 = 0 \tag{30}$$
$$\ddot{\lambda}_1 + \frac{1 + K T_2}{T_1} \dot{\lambda}_1 + \frac{K}{T_1} \lambda_1 = \frac{K}{T_1} f(t) \cos(\omega_1 t + \lambda_i) + \frac{K T_2}{T_1} \frac{d}{dt} \left[ f(t) \cos(\omega_1 t + \lambda_i) \right] \tag{30a}$$
$$\ddot{\lambda}_2 + \frac{1 + K T_2}{T_1} \dot{\lambda}_2 + \frac{K}{T_1} \lambda_2 = -\frac{K}{T_1} \lambda_1 f(t) \sin(\omega_1 t + \lambda_i) - \frac{K T_2}{T_1} \frac{d}{dt} \left[ \lambda_1 f(t) \sin(\omega_1 t + \lambda_i) \right] \tag{30b}$$
$$\ddot{\lambda}_3 + \frac{1 + K T_2}{T_1} \dot{\lambda}_3 + \frac{K}{T_1} \lambda_3 = \frac{K}{T_1} g(t) + \frac{K T_2}{T_1} \frac{d}{dt} g(t),$$

where

$$g(t) = \left[ \frac{\lambda_1^3}{6} - \frac{\lambda_1^2}{2}\, f(t) \cos(\omega_1 t + \lambda_i) - \lambda_2 f(t) \sin(\omega_1 t + \lambda_i) \right]. \tag{30c}$$

These equations permit the effect of noise to be interpreted
in terms of a series of block diagrams. The interpretation
of (30a) is shown in Fig. 6(a). A first approximation to the
phase-modulation is found by multiplying the noise $f(t)$
by the unmodulated vco output $\cos(\omega_1 t + \lambda_i)$ and passing
the resulting waveform through a filter with transfer
function

$$H(S) = \frac{\dfrac{K T_2}{T_1}\, S + \dfrac{K}{T_1}}{S^2 + \dfrac{1 + K T_2}{T_1}\, S + \dfrac{K}{T_1}}$$

and an attenuator with transmission $J$. More detailed
models based on (30b) and (30c) are shown in Figs. 6(b)
and 6(c). As before, all signals flow from left to right;
there is no feedback.

Fig. 6. (a) Model for first approximation to effects of noise on a
compensated loop. (b) Model for second approximation to effects
of noise on a compensated loop. (c) Model for third approximation
to effects of noise on a compensated loop.

⁸ W. Gröbner and N. Hofreiter, "Integraltafel; Zweiter Teil,
Bestimmte Integrale," Vienna, Springer-Verlag; 1950.


THE USE OF A PHASE-LOCKED LOOP IN THE
REFERENCE CHANNEL OF AN ANGLE-MEASURING
SYSTEM

Fig. 7 is a simplified block diagram of a simultaneous- 
lobing angle-measuring system used to measure the 
azimuth angle of a remote signal source. Synchronous 
detection is used to discriminate against noise which may 
be produced by a second remote source. The relatively 
noise-free reference signal required by the synchronous 
detector is provided by a phase-locked loop. 

The reference channel is assumed to be fed by a nondirectional
antenna. On the other hand, the measurement
channel is fed by an antenna designed to produce an
output $A_m$ proportional to the azimuth angle $\theta_s$, with a
phase reversal when the azimuth passes through zero,
as shown in Fig. 8. The input from the antenna to the
reference channel is

$$v_1 = A_r \sin(\omega_1 t + \lambda_i) + N_r f(t) \tag{31}$$

and the antenna and reference inputs to the measurement
channel are

$$v_m = A_m(\theta_s) \sin(\omega_1 t + \lambda_i) + N_m(\theta_n) f(t) \tag{32}$$
$$v_r = \sin(\omega_1 t + \lambda_{vco}). \tag{33}$$

In the absence of noise ($N_r = N_m = 0$) the steady-state
output of the vco will be $\cos(\omega_1 t + \lambda_i)$. The reference
input to the angle-measuring multiplier will then be
$\sin(\omega_1 t + \lambda_i)$, making the dc component of the multiplier
output just $A_m/2$. The factor $A_m$ has the form given in
Fig. 8, so that the output of the multiplier measures the
angle of arrival of the signal $\sin(\omega_1 t + \lambda_i)$ provided there
is no noise. Noise at the input to the reference channel
produces a phase-jitter of the vco output. If this phase
jitter results in a component in the vco output which is
correlated with the noise in the measurement channel, $v_5$
will contain a spurious dc component owing to the noise.
The results of the previous sections will be used to compute
this spurious component.

Fig. 7. Phase-locked loop used to provide reference channel for
angle-measuring system. [Block diagram: a signal source and a noise
source illuminate both antennas. Reference channel: input
$A_r \sin(\omega_1 t + \lambda_i) + N_r f(t)$, multiplier, amplifier of gain $G$, and
voltage-controlled oscillator with output $\cos(\omega_1 t + \lambda_{vco})$.
Measurement channel: input $A_m(\theta_s) \sin(\omega_1 t + \lambda_i) + N_m(\theta_n) f(t)$,
multiplied by $\sin(\omega_1 t + \lambda_{vco})$ obtained through a broadband 90°
phase shifter, followed by a low-pass filter.]

Fig. 8. Typical form for $A_m(\theta)$.

To find the output of the angle-measuring multiplier
when noise is present at the inputs of both channels, the
product $v_r v_m$ is formed and is expanded in a power series
in $J$. In this application both $J$ and $\Delta\lambda$ are defined for the
reference channel by $J = 2N_i/A_i$ and $\Delta\lambda = \lambda_{vco} - \lambda_i$.
When the coefficients of $J$ are collected in the resulting
expression for the output of the angle-measuring multiplier,
the result is

$$v_5 = \frac{A_m(\theta_s)}{2} + N_m f(t)\sin(\omega_i t + \lambda_i) + J N_m \lambda_1 f(t)\cos(\omega_i t + \lambda_i). \tag{34}$$


The multiplier is assumed to contain a filter which rejects
periodic components at the frequency $2\omega_i$. The first term
in (34) represents the desired output. The second term
has no dc component and can be made as small as desired
by low-pass filtering of $v_5$. The term $J N_m \lambda_1 f(t)\cos(\omega_i t + \lambda_i)$
contributes a dc error; the dc arises from the
multiplication of $\lambda_1$, a filtered version of $f(t)\cos(\omega_i t + \lambda_i)$,
by $f(t)\cos(\omega_i t + \lambda_i)$ itself. Thus the steady-state error
is

$$v_{5avg} = \lim_{T\to\infty}\frac{1}{T}\int_0^T J N_m \left[\int_0^\infty h(\tau)\, f(t-\tau)\cos[\omega_i(t-\tau) + \lambda_i]\, d\tau\right] f(t)\cos(\omega_i t + \lambda_i)\, dt \tag{35}$$


and if $f(t)$ contains no periodic components,

$$v_{5avg} = \frac{J N_m}{2}\int_0^\infty h(\tau)\,\phi_{ff}(\tau)\cos\omega_i\tau\, d\tau \tag{36}$$

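where $h(\tau)$ is the inverse transform of $H(S)$ and $\phi_{ff}$ is the
autocorrelation of $f(t)$. For noise with an autocorrelation
function

$$\phi_{ff}(\tau) = \sigma_f^2\,\frac{\sin\Delta\omega\tau}{\Delta\omega\tau}\cos\omega_i\tau \tag{37}$$

(produced by passing white noise through an ideal band-pass
filter of bandwidth $2\Delta\omega$ centered at $\omega_i$)

$$v_{5avg} = \frac{J N_m \sigma_f^2}{4}\int_0^\infty \frac{\sin\Delta\omega\tau}{\Delta\omega\tau}\,(1 + \cos 2\omega_i\tau)\,h(\tau)\, d\tau. \tag{38}$$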

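If $H(S)$ has the form given by (5) with parameters chosen
for good response to transient changes in input phase, i.e.,

$$H(S) = \frac{\sqrt{2}\,\omega_N S + \omega_N^2}{S^2 + \sqrt{2}\,\omega_N S + \omega_N^2} \tag{39}$$

the value of $h(\tau)$ is

$$h(\tau) = \sqrt{2}\,\omega_N \exp\!\left(-\frac{\omega_N\tau}{\sqrt{2}}\right)\cos\frac{\omega_N\tau}{\sqrt{2}}. \tag{40}$$

Using this specific form for $h(\tau)$ in (38) gives

$$v_{5avg} = \frac{J N_m \sigma_f^2}{4}\sqrt{2}\,\omega_N\int_0^\infty \frac{\sin\Delta\omega\tau}{\Delta\omega\tau}\exp\!\left(-\frac{\omega_N\tau}{\sqrt{2}}\right)\cos\frac{\omega_N\tau}{\sqrt{2}}\, d\tau
+ \frac{J N_m \sigma_f^2}{4}\sqrt{2}\,\omega_N\int_0^\infty \frac{\sin\Delta\omega\tau}{\Delta\omega\tau}\exp\!\left(-\frac{\omega_N\tau}{\sqrt{2}}\right)\cos\frac{\omega_N\tau}{\sqrt{2}}\cos 2\omega_i\tau\, d\tau. \tag{41}$$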

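In the practical case where the noise bandwidth $\Delta\omega$ and
the loop bandwidth $\omega_N$ are small compared to the carrier
frequency $\omega_i$, the second term may be neglected in comparison
to the first term. Using the known value of the
first term gives⁹

$$v_{5avg} = \frac{\sqrt{2}}{2}\,\frac{\sigma_i}{A_i}\,\sigma_m\,\frac{\omega_N}{\Delta\omega}\left[\frac{1}{2}\arctan\frac{\sqrt{2}\,(\Delta\omega/\omega_N)}{1 - (\Delta\omega/\omega_N)^2}\right] \tag{42}$$

when $\Delta\omega/\omega_N < 1$, and

$$v_{5avg} = \frac{\sqrt{2}}{2}\,\frac{\sigma_i}{A_i}\,\sigma_m\,\frac{\omega_N}{\Delta\omega}\left[\frac{\pi}{2} + \frac{1}{2}\arctan\frac{\sqrt{2}\,(\Delta\omega/\omega_N)}{1 - (\Delta\omega/\omega_N)^2}\right] \tag{42a}$$

when $\Delta\omega/\omega_N > 1$.

For $\Delta\omega \ll \omega_N$ (noise bandwidth much less than loop
bandwidth)

$$v_{5avg} = \frac{1}{2}\,\frac{\sigma_i}{A_i}\,\sigma_m \tag{43}$$

and for $\Delta\omega \gg \omega_N$ (noise bandwidth much greater than
loop bandwidth)

$$v_{5avg} = \frac{\sqrt{2}\,\pi}{4}\,\frac{\sigma_i}{A_i}\,\sigma_m\,\frac{\omega_N}{\Delta\omega} \tag{44}$$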
where $\sigma_i = N_i\sigma_f$ and $\sigma_m = N_m\sigma_f$ are the rms noises in
reference and measuring channels, respectively. The error
in the measurement of the null point in the antenna
pattern produced by the presence of correlated noise at
the inputs of both the reference and measuring channels
is thus a function of: $\sigma_i/A_i$, the ratio of rms noise to peak
signal in the reference channel; $\sigma_m$, the rms noise in the
measurement channel; and $\omega_N/\Delta\omega$, the ratio of loop
bandwidth to noise bandwidth. A plot of $v_{5avg}$ vs $\Delta\omega/\omega_N$
for $(\sigma_i/A_i)\sigma_m = 1$ is given in Fig. 9. The voltage $v_{5avg}$ is
converted to a phase error by dividing it by the calibration
constant of the measurement channel, K, volts per
angular mil.

⁹ The principal branch of the arctan function, running from
$\arctan(-\infty) = -\pi/2$ through $\arctan 0 = 0$ to $\arctan(+\infty) = \pi/2$,
should be used in evaluating (42) and (42a).

Fig. 9. DC output produced by cross-correlation between noise in
reference and measurement channels. (Axes: dc output $v_{5avg}$ versus
bandwidth ratio $\Delta\omega/\omega_N$.)
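The closed form can be checked numerically. The following is a minimal sketch, assuming (42) and (42a) read as $(\sqrt{2}/2)(\sigma_i/A_i)\sigma_m(\omega_N/\Delta\omega)$ times $\tfrac{1}{2}\arctan[\cdot]$ or $\tfrac{\pi}{2} + \tfrac{1}{2}\arctan[\cdot]$, and that only the first integral of (41) is retained. The function names, the normalization $\omega_N = 1$, and the integration limits are illustrative choices, not part of the paper.

```python
import math


def dc_error(ratio, sigma_i_over_a=1.0, sigma_m=1.0):
    """Closed form of (42)/(42a); ratio = delta_omega / omega_N, ratio > 0."""
    # atan2 returns an angle in (0, pi): it equals arctan[.] for ratio < 1
    # and pi + arctan[.] (principal branch) for ratio > 1.
    angle = math.atan2(math.sqrt(2.0) * ratio, 1.0 - ratio * ratio)
    return (math.sqrt(2.0) / 4.0) * sigma_i_over_a * sigma_m * angle / ratio


def dc_error_numeric(ratio, t_max=60.0, steps=60000):
    """Trapezoidal evaluation of the first integral in (41), taking
    omega_N = 1 and (sigma_i / A_i) * sigma_m = 1 (assumed normalization)."""
    a = 1.0 / math.sqrt(2.0)

    def integrand(tau):
        sinc = 1.0 if tau == 0.0 else math.sin(ratio * tau) / (ratio * tau)
        return sinc * math.exp(-a * tau) * math.cos(a * tau)

    dt = t_max / steps
    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for k in range(1, steps):
        total += integrand(k * dt)
    # Prefactor (J N_m sigma_f^2 / 4) * sqrt(2) * omega_N with J = 2 N_i / A_i.
    return (math.sqrt(2.0) / 2.0) * total * dt
```

Under these assumptions the small-ratio limit reproduces the value 1/2 of (43), and the large-ratio behavior falls off as $\omega_N/\Delta\omega$, as in (44).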


CONCLUSION


A method has been derived for determining the phase 
modulation produced in the output of a phase-locked loop 
by noise at its input. The method has been shown to be 
useful in predicting the steady-state phase offset produced 
by cross-correlation effects within a single loop and in 
predicting the errors which may occur when the loop is 
used to provide a reference channel for an angle-measuring 
system. The analysis presented in this paper permits the 
phase modulation resulting from input noise to be com- 
puted to any desired accuracy (at the cost of increasing 
analytic difficulty); a physical interpretation has been 
included as a guide to choice of a model sufficiently 
complex to explain the phenomenon sought, yet sufficiently 
simple to permit a solution to be found. 
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A Note on the Sampling Principle for 


Continuous Signals*


A. V. BALAKRISHNAN†


Summary—Two sampling (integral interpolation) theorems for
continuous signals (continuous parameter stochastic processes) are
proved. The first of these is the sampling principle introduced by
Shannon, precise formulation or proof of which has not appeared
hitherto. Obtained as a secondary result in this connection is a
generalization of a result on the spectra of sampled signals given by
Bennett. The second theorem is a stochastic version of the Newton-
Gauss interpolation formula as representative of a different class
of sampling theorems.


IN THIS NOTE we state and prove two "sampling"
theorems for continuous signals (continuous parameter
stochastic processes). The first of these is the
sampling principle of Shannon¹ and although it plays a
fundamental role in the information theory of continuous
signals, no precise (stochastic) formulation or proof has
appeared in the literature. The second is a stochastic
version of the Gauss-Newton interpolation formula for
nonrandom functions.


THE SAMPLING PRINCIPLE OF SHANNON 


Theorem 1


Let $x(t)$,² $-\infty < t < \infty$, be a real or complex valued
stochastic process, stationary in the "wide sense"³ (or
"second order" stationary) possessing a spectral density
which vanishes outside the interval of angular frequency
$[-2\pi W, 2\pi W]$. Then $x(t)$ has the representation:

$$x(t) = \operatorname*{l.i.m.} \sum_{n=-\infty}^{\infty} x\!\left(\frac{n}{2W}\right)\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)} \tag{1}$$

for every $t$, where lim stands for limit in the mean square
sense.⁴


Proof


The essence of the proof is that the right side of (1) is
the best linear estimate, in the mean square sense, of
$x(t)$ in terms of $\{x(n/2W)\}$, with zero estimation error.
The sampling principle per se for nonrandom functions

* Manuscript received by the PGIT, November 23, 1956.

† University of Southern California, Los Angeles, Calif.

¹ C. E. Shannon, "A mathematical theory of communication,"
Bell Sys. Tech. J.; August, 1948.

² Throughout this note we assume all processes have finite
variances and means. Further, all equalities involving random
variables are understood to be with probability one.

³ J. L. Doob, "Stochastic Processes," John Wiley and Sons, Inc.,
New York, N. Y.; 1953.

⁴ More explicitly this means

$$\lim_{N\to\infty} E\left\{\left|x(t) - \sum_{n=-N}^{N} x\!\left(\frac{n}{2W}\right)\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}\right|^2\right\} = 0.$$

As is known, this implies convergence in probability as well.


is applied to the covariance function® R(t) of the process 
to yield: 


$$R(t - \tau) = \sum_{n=-\infty}^{\infty} R\!\left(\frac{n}{2W} - \tau\right)\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}. \tag{2}$$


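Thus, let $x^*(t)$ be the best linear estimate of $x(t)$ in
terms of $\{x(n/2W)\}$ in the mean square sense. Then

$$x^*(t) = \operatorname*{l.i.m.} \sum_{n} x\!\left(\frac{n}{2W}\right) a_n(t)$$

where the $a_n(t)$ satisfy the discrete version of the Wiener-
Hopf equation:

$$R\!\left(t - \frac{m}{2W}\right) = \sum_{n} a_n(t)\, R\!\left(\frac{n - m}{2W}\right) \tag{3}$$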


for every $m$ and every $t$. We note that the series need not
converge absolutely. Since $R(t)$ is the Fourier transform
of the spectral density known to vanish outside
$[-2\pi W, +2\pi W]$, it follows from the "sampling" principle
commonly attributed to E. T. Whittaker⁵ (integral
interpolation) that (3) is satisfied with

$$a_n(t) = \frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}.$$


We shall actually prove (2) here, since the oft-quoted
proof given by J. M. Whittaker⁶ leaves room for doubt
as to whether or not $R(t)$ has to be further restricted for
(2) to hold. Thus, let $g(f)$ be the spectral density of the
process so that

$$R(t - \tau) = \int_{-W}^{W} e^{2\pi i f(t - \tau)}\, g(f)\, df.$$

Now, $e^{2\pi i f t}$ as a function of $f$ can be expanded in a Fourier
series in $[-W, +W]$ to yield:

$$e^{2\pi i f t} = \sum_{n=-\infty}^{\infty} a_n\, e^{i\pi n f/W} \tag{4}$$

for every $f$ in the open interval $(-W, W)$ where the
Fourier coefficients $\{a_n\}$ can be expressed:

$$a_n = \frac{1}{2W}\int_{-W}^{W} e^{2\pi i f t}\, e^{-i\pi n f/W}\, df = \frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}.$$


⁵ E. T. Whittaker, "On the functions which are represented by
the expansions of the interpolation theory," Proc. Roy. Soc.,
Edinburgh, vol. 35; 1915.

⁶ J. M. Whittaker, "Interpolatory Function Theory," Cambridge
University Press, Cambridge, Eng.; 1935.
The important fact about (4) is that since $e^{2\pi i f t}$ is absolutely
continuous in $(-W, +W)$, it follows⁷ that the
series in (4) converges boundedly to $e^{2\pi i f t}$ in the open
interval $(-W, W)$. Hence by the Lebesgue convergence
theorem,⁸ we obtain

$$R(t - \tau) = \lim_{N\to\infty}\int_{-W}^{W} \sum_{n=-N}^{N} a_n\, e^{i\pi n f/W}\, e^{-2\pi i f\tau}\, g(f)\, df
= \lim_{N\to\infty}\sum_{n=-N}^{N} a_n\, R\!\left(\frac{n}{2W} - \tau\right) \tag{5}$$

since

$$R\!\left(\frac{n}{2W} - \tau\right) = \int_{-W}^{W} e^{i\pi n f/W}\, e^{-2\pi i f\tau}\, g(f)\, df$$

establishing (2) and, in particular, (3), as required.

We next prove that the error in estimation is zero.
Since $x^*(t)$ is the optimal estimate, the mean square
error:

$$E\{|x(t) - x^*(t)|^2\} = E[(x(t) - x^*(t))\,\overline{x(t)}]
= R(0) - E[x^*(t)\,\overline{x(t)}]
= R(0) - \sum_{n=-\infty}^{\infty} R\!\left(\frac{n}{2W} - t\right)\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}
= 0,$$

using (3). This concludes the proof of the theorem.
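A numerical illustration of representation (1) may help. The sketch below sums a symmetric partial sum of the cardinal series for a single sinusoid whose frequency lies strictly inside the band. The function names, the test signal, and the truncation length are illustrative assumptions; a finite partial sum only approximates the mean-square limit, and here the truncation error decays roughly as 1/N.

```python
import math


def interpolation_kernel(u, n):
    """sin(pi (u - n)) / (pi (u - n)), with the removable singularity filled in."""
    d = math.pi * (u - n)
    return 1.0 if d == 0.0 else math.sin(d) / d


def cardinal_series(signal, w, t, n_terms):
    """Symmetric partial sum (n = -n_terms .. n_terms) of the right side of (1)."""
    u = 2.0 * w * t
    return sum(
        signal(n / (2.0 * w)) * interpolation_kernel(u, n)
        for n in range(-n_terms, n_terms + 1)
    )


def make_sinusoid(freq, phase):
    """Sample function cos(2 pi freq t + phase) of a random-phase sinusoid."""
    return lambda t: math.cos(2.0 * math.pi * freq * t + phase)
```

For example, with `w = 1.0`, `make_sinusoid(0.5, 0.7)`, and `t = 0.15`, a few thousand terms should agree with the signal value to well within a per cent.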

As far as most applications are concerned, Theorem 1 is
probably adequate. It is possible, however, to relax the 
requirement that the process have a spectral density. 


Theorem 2 


Let $x(t)$, $-\infty < t < \infty$, be a real or complex-valued
stochastic process, second order stationary, having a
spectral distribution $\Phi(f)$ such that

1) $\left(\int_{-\infty}^{-W} + \int_{W}^{\infty}\right) d\Phi(f) = 0$

2) $\Phi(f)$ is continuous at $\pm W$.

Then $x(t)$ has again the representation (1).


Proof 


If we proceed as in Theorem 1, we have only to establish
(2) again. However, since $\Phi(f)$ is continuous at $\pm W$, it
follows that the series in (4) converges boundedly almost
everywhere with respect to the Lebesgue-Stieltjes
measure induced by $\Phi(f)$, so that the Lebesgue convergence
theorem can be invoked again to obtain (5).

It is easy to demonstrate that the theorem is false if
$\Phi(f)$ has a jump at $\pm W$. We have only to take

$$x(t) = \cos(2\pi W t + \theta)$$

where $\theta$ is a random variable with uniform probability
density in $0 < \theta < 2\pi$. For, the right side of (1) in this
case is

$$\operatorname*{l.i.m.} \sum_{n=-\infty}^{\infty} x\!\left(\frac{n}{2W}\right) a_n
= \left[\sum_{n=-\infty}^{\infty} a_n \cos n\pi\right]\cos\theta
= [\cos 2\pi W t]\cos\theta
\neq \cos(2\pi W t + \theta).$$

⁷ E. C. Titchmarsh, "Theory of Functions," Oxford University
Press, Oxford, Eng., p. 408; 1950.

⁸ Ibid., p. 345.

The best we can state for the case where the spectral 
distribution has a jump at one or both end points is 
given below. 


Corollary 


Let $x(t)$ be a real or complex-valued second-order
stationary stochastic process having a spectral distribution
$\Phi(f)$ such that for some $W_0 > 0$,


Oo), — DOW o) ee 


Then the mean squared error in the representation of 
x(t) as 


is given by 
$R(0)[(\text{jump of } \Phi(f) \text{ at } +W_0) + (\text{jump of } \Phi(f) \text{ at } -W_0)]\,\sin^2 2\pi W_0 t.$


However, for every W > W,, (1) is valid again with zero 
error. 


Proof 


The proof is omitted since it hinges merely on the 
equality 


$$\sum_{n=-\infty}^{\infty} a_n \cos n\pi = \cos(2\pi W t)$$


as far as the first part is concerned, and the second part 
follows from the theorem. 

The inverse problem of obtaining a continuous signal 
from a discrete signal leads to a useful converse of Theorem 
2 (and 1). 


Theorem 3 


Let $x_n$, $-\infty < n < \infty$, be a discrete parameter real
or complex valued second-order stationary stochastic
process having a spectral distribution which is continuous
at $\pm\tfrac{1}{2}$. Let

$$x(t) = \operatorname*{l.i.m.} \sum_{n=-\infty}^{\infty} x_n\,\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)} \tag{1a}$$

for every $t$, $-\infty < t < \infty$. Then $x(t)$ is a second-order
stationary process having a spectral distribution satisfying
conditions 1) and 2) of Theorem 2.


Proof


It is best to use the spectral representation theorem⁹

$$x_n = \int_{-1/2}^{1/2} e^{2\pi i n\lambda}\, dy(\lambda)$$

where $y(\lambda)$ is a (suitably normalized) orthogonal process
with

$$E[\,|dy(\lambda)|^2\,] = dG(\lambda)$$

where $G(\lambda)$ is the spectral distribution of the process.
Letting

$$a_n = \frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}$$

we first show that the indicated limit in (1a) exists. The
Nth partial sum therein can be written:

$$\sum_{n=-N}^{N} a_n x_n = \int_{-1/2}^{1/2} \sum_{n=-N}^{N} a_n\, e^{2\pi i n\lambda}\, dy(\lambda)
= \int_{-W}^{W} \sum_{n=-N}^{N} a_n\, e^{i\pi n f/W}\, dy(f/2W).$$

Since $G(f/2W)$ has no jumps at $\pm W$, it follows that
$\sum a_n e^{i\pi n f/W}$ converges boundedly almost everywhere
with respect to the measure $dG(f/2W)$ to $e^{2\pi i f t}$ for
$-W < f < W$, and hence also in the mean square sense.
The process $y(\lambda)$ being orthogonal, the mean square
convergence of the series in (1a) follows from this. Further
we have:

$$x(t) = \int_{-W}^{W} e^{2\pi i f t}\, dy(f/2W).$$

From the spectral representation theorem for continuous
parameter processes it follows that $x(t)$ is second-order
stationary and that the covariance function $R(t)$ of the
process is given by:

$$R(t) = \int_{-W}^{W} e^{2\pi i f t}\, dG(f/2W). \tag{6}$$

Moreover $G(f/2W)$ satisfies conditions 1) and 2) of
Theorem 2, and further, therefore, $R(t)$ satisfies (2).

That the theorem is false if the $x_n$ process has a spectral
distribution with a jump at $\pm W$ is demonstrated by
the following example. Let

$$x_n = \cos(n\pi + \theta)$$

where $\theta$ is a random variable with uniform probability
density in $0 < \theta < 2\pi$. Then

$$\operatorname*{l.i.m.} \sum_{n=-\infty}^{\infty} x_n\,\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)} = [\cos 2\pi W t]\cos\theta$$

which is clearly not stationary.

By way of corollary to this theorem we shall obtain a
generalization of a result derived by Bennett¹⁰ by a different
method.


⁹ Doob, op. cit., p. 481.

¹⁰ W. R. Bennett, "Methods of solving noise problems,"
Proc. IRE, vol. 44, pp. 609-638; May, 1956.
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Corollary 1 


Let the discrete process x, of the theorem be derived 
by sampling from a continuous parameter process y(t) 
so that 


= y(n/2W,), Wal 


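where the $y(t)$ process is second-order stationary whose
spectral distribution $\Phi(f)$ has no jumps at $\pm W_0 + 2kW_0$,
$k = 0, \pm 1, \pm 2, \cdots$. Then if

$$x(t) = \operatorname*{l.i.m.} \sum_{n=-\infty}^{\infty} x_n\,\frac{\sin\pi(2Wt - n)}{\pi(2Wt - n)}$$

the $x(t)$ process is second-order stationary, with covariance
function $R(t)$ given by

$$R(t) = \int_{-W}^{W} e^{2\pi i f t}\, d\mu(f)$$

where

$$d\mu(f) = \sum_{k=-\infty}^{\infty} d\Phi\!\left(\frac{W_0 f}{W} + 2kW_0\right).$$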


Proof 


We first derive an expression for the spectral distribution
of the $x_n$ process. For this we note that

$$E[x_{m+n}\,\overline{x_m}] = \int_{-\infty}^{\infty} e^{2\pi i n f/2W_0}\, d\Phi(f)$$

and since

$$e^{2\pi i n f/2W_0} = e^{2\pi i n (f + 2kW_0)/2W_0}$$

for every integer $k$, we can write

$$E[x_{m+n}\,\overline{x_m}] = \int_{-W_0}^{W_0} e^{2\pi i n f/2W_0}\, d\psi(f) \tag{7}$$

where

$$d\psi(f) = \sum_{k=-\infty}^{\infty} d\Phi(f + 2kW_0),$$

the necessary convergence being easily verified. By a
change of variable in (7) we have

$$E[x_{m+n}\,\overline{x_m}] = \int_{-1/2}^{1/2} e^{2\pi i n\lambda}\, dG(\lambda)$$

where

$$dG(\lambda) = d\psi(2W_0\lambda).$$

Moreover, the $x_n$ process satisfies the conditions of the
theorem, so that the $x(t)$ process is second-order stationary.
Further, the covariance function $R(t)$ is given by (6),
so that

$$R(t) = \int_{-W}^{W} e^{2\pi i f t}\, dG(f/2W) = \int_{-W}^{W} e^{2\pi i f t}\, d\psi\!\left(\frac{W_0 f}{W}\right)$$

proving the corollary.
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Perhaps a more useful form of this result is obtained 
if we require that the $y(t)$ process have a spectral density.


Corollary 2 


Let the $y(t)$ process of Corollary 1 have an absolutely
continuous spectrum with density $g(f)$. Then $x(t)$ as
defined in Corollary 1 is second-order stationary without
any additional conditions on $y(t)$, and further, has an
absolutely continuous spectrum with density defined
(almost everywhere with respect to Lebesgue measure)
by:

$$\frac{W_0}{W}\sum_{k=-\infty}^{\infty} g\!\left(\frac{W_0 f}{W} + 2kW_0\right) \quad \text{for } -W < f < W; \qquad 0 \text{ otherwise.}$$


Proof 


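The main step that remains in the proof is to show that

$$\int_{-W}^{W} e^{2\pi i f t}\sum_{k=-\infty}^{\infty} g\!\left(\frac{W_0 f}{W} + 2kW_0\right) df
= \sum_{k=-\infty}^{\infty}\int_{-W}^{W} e^{2\pi i f t}\, g\!\left(\frac{W_0 f}{W} + 2kW_0\right) df$$

and this follows from the fact that each $g(W_0 f/W + 2kW_0)$
is nonnegative, and standard Lebesgue integration
theory.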


NEWTON-GAUSS SAMPLING THEOREM


Besides the Cardinal series just discussed, the well-known
integral interpolation formulas of Gregory,
Newton, Gauss, Stirling, and Bessel also can be shown
to have stochastic analogs. Here, however, we shall
consider only the Newton-Gauss formula since, as in
the Cardinal series, it begins with an intermediate value
and uses differences on either side of this value.


Theorem 4 


Let $x(t)$ be a real or complex valued stochastic process
satisfying the conditions of Theorem 2. Then $x(t)$ has the
representation:


x(t) = lam bs 


fe “ j vob 


ire aa 


= 


ae (' a “ a'x(—20} 
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5 
end ‘ 
for every $t$, where

$h = 1/2W$

$\Delta x(nh) = x(nh + h) - x(nh)$

$\Delta^{r+1} x(nh) = \Delta[\Delta^{r} x(nh)]$
regs) _ (t+mh)(t+mh—h)---(t-+-mh—rh+h) 
r - r! ‘ 


Proof 


The essential idea of the proof is again that the right
side of (8) is the optimal linear estimate, with zero error,
of $x(t)$ in terms of the random variables of the form
$\{\Delta^{n} x(-nh)\}$ which are themselves linear combinations
of the random variables $\{x(nh)\}$. It is possible to
shorten the proof by using Theorem 2. First we note that

$$E[\Delta x(-mh)\,\overline{x(nh)}] = R(-mh + h - nh) - R(-mh - nh) = \Delta R(-mh - nh)$$

where $R(t)$ is the covariance function of the process.
Further, by induction, we have

$$E[\Delta^{r} x(-mh)\,\overline{x(nh)}] = \Delta^{r} R(-mh - nh). \tag{9}$$

Now by Theorem 2, $x(t)$ for each $t$, belongs to the closure
(in the mean square sense) of the linear space generated
by the random variables $\{x(nh)\}$. Hence to establish
(8), it is enough to show that, using (9), for every integer $m$


R(t — mh) = | e(— mn 


t= limes 


+{(' x "VR —mh — h) + a an an A’R(—mh — 21} 


a 


However, by a result due to Steffensen-Ferrar-Whittaker,¹¹
the convergence of (2) implies (10). Since from Theorem
2, (2) is known to hold, this proves the Theorem.


+f ARC 


(10 


J 


¹¹ J. M. Whittaker, op. cit., p. 262.
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A Note on Some Statistics Concerning Typewritten 
or Printed Material*


SID DEUTSCH†


THIS NOTE discusses some of the statistics
associated with a sample of typewritten or printed
material. The sample is shown in Fig. 1. In the
vertical direction, the sample includes 5 rows of type
and the equivalent of 5 spaces between the rows of type.
The vertical lines of Fig. 1 approximately represent
the scanning lines for facsimile transmission. Although
the original sample consisted of printed material (m,
for example, occupies much more space than i), the size
of the letters is approximately equal to that of typewritten
copy. Some minor differences between printed
and typewritten copy will exist because typewritten
letters fall underneath each other, so that the spaces
between the letters tend to form continuous columns.
For statistical purposes, it is assumed that the sample is
composed of discrete resolution elements that are square.
Furthermore, it is assumed that the signal is quantized
so that if the area of a square is over 50 per cent black,
the entire square is black, etc. The quantized sample is
shown in Fig. 2, and corresponds to the signal obtained
with a sampling and binary quantization process.
The highest probabilities of black or white messages
are listed in Table I.


TABLE I

Message   Direction   Length of        Per Cent
          of Scan     Message, Boxes   Probability
Black     Vertical    1                60.5
Black     Vertical    2                14.3
Black     Vertical    3                6.17
Black     Vertical    7                6.17
White     Vertical    1                1228
White     Vertical    2                12.8
White     Vertical    8                14.1
Black     Horizontal  1                Ale i
Black     Horizontal  2                39.4
Black     Horizontal  3                9.21
White     Horizontal  2                26.8
White     Horizontal  3                27.8
White     Horizontal  4                14.3


The entropy of a block of N symbols was also considered.¹
If N = 2, for example, the possible sequences
are as follows: $B_1 = 00$, $B_2 = 01$, $B_3 = 10$, and $B_4 = 11$.
A "zero" in the sequence represents a white resolution
element, etc. If $P(B_i)$ is the probability with which
sequence $B_i$ occurs, then the entropy per resolution
element using blocks of N symbols is given by

$$G_N = -\frac{1}{N}\left[P(B_1)\log_2 P(B_1) + P(B_2)\log_2 P(B_2) + \cdots\right]. \tag{1}$$

* Manuscript received by the PGIT, December 27, 1956. This
paper is a summary of Rep. R-526, Microwave Res. Inst., Brooklyn,
N. Y., October, 1956. The project was supported by the Rome Air
Dev. Ctr., Rome, N. Y.

† Polytechnic Inst. of Brooklyn, Brooklyn, N. Y.

¹ C. E. Shannon and W. Weaver, "The Mathematical Theory of
Communication," University of Illinois Press, Urbana, Ill., p. 25;
1949.

The sample being considered here contains 5025
resolution elements. In the limiting case, where N = 5025,
$G_{5025}$ is equal to H, the entropy per resolution element.
As N is increased, $G_N$ approaches H.

In counting blocks of symbols, the last box in each 
line was assumed to be grouped with the first box in the 
next line. This minimizes the effects of the borders on the 
sample. 
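Eq. (1) can be sketched as follows. The note does not say whether the blocks overlap; this sketch assumes sliding (overlapping) blocks taken along one long sequence formed by joining the scan lines end to end, which is one reading of the grouping rule above. The function name and the toy input are illustrative.

```python
import math
from collections import Counter


def block_entropy_per_element(bits, n):
    """G_N of (1): entropy per element estimated from blocks of n symbols.

    Assumption: blocks are taken at every position of the concatenated scan,
    so the last box of one line is grouped with the first box of the next.
    """
    blocks = [tuple(bits[i:i + n]) for i in range(len(bits) - n + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / n


# Toy input: strictly alternating white (0) and black (1) boxes.
alternating = [0, 1] * 50
g1 = block_entropy_per_element(alternating, 1)
g2 = block_entropy_per_element(alternating, 2)
```

For the alternating toy sequence `g1` is 1 bit, while `g2` is close to 0.5 bit, reflecting that each element is fully determined by its predecessor.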

A better approximation to H is given by the conditional
entropy of the next symbol when the (N − 1) preceding
symbols are known. This is given by

$$F_N = N G_N - (N - 1)\, G_{N-1}. \tag{2}$$
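As a check on (2), the $F_N$ entries of Table II can be recomputed from the tabulated $G_N$ values. The sketch below uses the vertical-scan and horizontal-scan $G_N$ figures as printed in Table II; the function name is an illustrative choice.

```python
def conditional_entropy(g_values, n):
    """F_N = N G_N - (N - 1) G_{N-1}, with G_0 taken as 0 so that F_1 = G_1."""
    previous = g_values[n - 1] if n > 1 else 0.0
    return n * g_values[n] - (n - 1) * previous


vertical_g = {1: 0.6844, 2: 0.6234, 3: 0.5990}
horizontal_g = {1: 0.6844, 2: 0.6442}

f2_vertical = conditional_entropy(vertical_g, 2)      # Table II lists 0.5624
f3_vertical = conditional_entropy(vertical_g, 3)      # Table II lists 0.5502
f2_horizontal = conditional_entropy(horizontal_g, 2)  # Table II lists 0.6040
```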


Fig. 1.

Fig. 2.

A calculation was also made of $G_3$ for bidirectional
scan; that is, each block of 3 symbols was arranged as
follows:

Numbers 1, 2, and 3 represent the first, second, and
third digit of each possible sequence.

The various $G_N$ and $F_N$ values are summarized in Table II.

TABLE II

      Vertical Scan      Horizontal Scan    Bidirectional Scan
N     G_N      F_N       G_N      F_N       G_N
1     0.6844   0.6844    0.6844   0.6844    —
2     0.6234   0.5624    0.6442   0.6040    —
3     0.5990   0.5502    —        —         0.6107
4     0.5601   —         —        —         —

Eq. (1) gives the entropy per resolution element using
blocks of N symbols. Suppose that the $B_i$ symbols
represent the various lengths of black, such as $B_1$ = 1
box long, $B_2$ = 2 boxes long, etc. Then $P(B_1)$ is the
probability with which black messages 1 box long occur,
etc.; N, in this case, is unity, since each black length is
treated as an individual message. If these changes are
made, (1) gives the entropy per black message (or,
similarly, per white message). The results here are given
in Table III.

TABLE III

Message   Direction of Scan   bits/message
Black     Vertical            2.002
White     Vertical            4.132
Black     Horizontal          IL Sil
White     Horizontal          Seno:

On an average, then, vertical scan requires 3.067
bits/message. Out of the total of 5025 boxes, there are
810 black and white messages. This corresponds to 6.203
boxes/message. Potentially, there is a time-bandwidth
saving here of 6.203/3.067, or 2.022.

On an average, horizontal scan requires 2.52 bits/message.
There are 934 horizontal black and white
messages, or 5.38 boxes/message. Potentially, there is a
time-bandwidth saving horizontally of 2.135.

One can conclude that a 2 to 1 saving is possible with
typewritten or printed material. A code that could yield
this much compression would be very complicated, but
relatively simple codes should be capable of saving
1.5 to 1.
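The savings quoted above follow from two divisions, sketched here with the figures given in the text (5025 boxes; 810 vertical and 934 horizontal messages; 3.067 and 2.52 bits/message). The function name is an illustrative choice.

```python
def time_bandwidth_saving(total_boxes, message_count, bits_per_message):
    """Return (boxes per message, potential saving ratio).

    Uncoded transmission spends one binary digit per box, so the saving is
    boxes per message divided by the average entropy per message.
    """
    boxes_per_message = total_boxes / message_count
    return boxes_per_message, boxes_per_message / bits_per_message


vertical = time_bandwidth_saving(5025, 810, 3.067)
horizontal = time_bandwidth_saving(5025, 934, 2.52)
```

The vertical bits/message figure is itself the mean of the black and white vertical entries of Table III, (2.002 + 4.132)/2 = 3.067.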
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A Note on the Construction of a Multivariate 


Normal Sample*


G. MARSAGLIA 


Summary—This note points out the superfluity of a method of
Stein and Storer for constructing a multivariate normal sample, and
suggests a simple alternative.


STEIN and Storer¹ have recently discussed the
problem of constructing samples having a specified
multivariate normal distribution. They are
apparently unaware of simple facts concerning the
behavior of the covariance matrix under a linear
transformation. If $\xi = (x_1, \cdots, x_n)$ is a $1 \times n$ vector whose
components are random variables having covariance
matrix $C$, and if $M$ is an $n \times n$ matrix of constants,
then the covariance matrix of the components of $\xi M$
is $M'CM$. Thus, if $x_1, \cdots, x_n$ are independent, normally
distributed, with variances 1, the components of $\xi M$
will be jointly normally distributed with a specified
covariance matrix $S$ if, and only if,

$$M'M = S. \tag{1}$$

One need not go to such extravagant lengths as the
orthogonal diagonalization of $S$ in order to solve (1).
Various well-known elementary procedures may be used.
There are exactly $2^n$ triangular (zeros below the diagonal)
$n \times n$ matrices $M$ which solve (1), for if $S_k$ is the matrix

* Manuscript received by the PGIT, October 15, 1956. 
¹ S. Stein and J. E. Storer, "Generating a gaussian sample,"
IRE Trans., vol. IT-2, pp. 87-90; June, 1956.


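of elements common to the first $k$ rows and columns of
$S$, and if $S_{k+1}$ is partitioned so:

$$S_{k+1} = \begin{pmatrix} S_k & \alpha_k' \\ \alpha_k & d_k \end{pmatrix}$$

then sequences $M_1, M_2, \cdots$ and $M_1^{-1}, M_2^{-1}, \cdots$ such that
$M_k' M_k = S_k$ may be constructed by the relations

$$M_{k+1} = \begin{pmatrix} M_k & \beta_k' \\ 0 & b_k \end{pmatrix}, \qquad
M_{k+1}^{-1} = \begin{pmatrix} M_k^{-1} & -b_k^{-1} M_k^{-1}\beta_k' \\ 0 & b_k^{-1} \end{pmatrix}$$

where $\beta_k = \alpha_k M_k^{-1}$ and $b_k^2 = d_k - \beta_k\beta_k'$. There are exactly
two choices for $b_k$, since $S_{k+1}$ is positive definite, and

$$(\alpha_k S_k^{-1},\, -1)\; S_{k+1}\; (\alpha_k S_k^{-1},\, -1)' = d_k - \beta_k\beta_k' > 0.$$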
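One way to carry out such a bordering construction is sketched below. It assumes the reading that $M$ is upper triangular with $M'M = S$ and that each step appends one column (the vector called $\beta_k$ here) and one diagonal entry $b_k$; the positive root is chosen at every step, which picks one of the $2^n$ solutions. Function names are illustrative, and the sampling helper uses Python's standard `random.Random.gauss`.

```python
import math
import random


def triangular_factor(s):
    """Upper-triangular M with M'M = S, built one bordered column at a time."""
    n = len(s)
    m = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Column k above the diagonal solves M_k' beta = alpha by forward
        # substitution, where alpha is the bordering column of S.
        for i in range(k):
            acc = s[i][k] - sum(m[j][i] * m[j][k] for j in range(i))
            m[i][k] = acc / m[i][i]
        b_squared = s[k][k] - sum(m[j][k] ** 2 for j in range(k))
        if b_squared <= 0.0:
            raise ValueError("S is not positive definite")
        m[k][k] = math.sqrt(b_squared)  # the other choice is the negative root
    return m


def normal_sample(m, rng):
    """One draw of xi M, where xi has independent unit-variance normal entries."""
    n = len(m)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(xi[i] * m[i][j] for i in range(n)) for j in range(n)]
```

For instance, `normal_sample(triangular_factor(s), random.Random(0))` returns one vector whose components have covariance matrix `s`.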


We apply this procedure to the case discussed.¹ If
$e^{-|t|}$ is the covariance function of a stationary normal
stochastic process $y_t$, and if $t_1 < t_2 < \cdots < t_n$ are "time"
points, not necessarily equally spaced, the covariance
matrix $S$ of $y_{t_1}, \cdots, y_{t_n}$ will have $i, j$th element $e^{-|t_i - t_j|}$.

A solution to (1) in this case has $i, j$th element

$$M_{ij} = \begin{cases}
0, & j < i \\
e^{t_1 - t_j}, & i = 1 \le j \\
e^{t_i - t_j}\sqrt{1 - e^{2(t_{i-1} - t_i)}}, & 1 < i \le j.
\end{cases}$$
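The closed form can be verified directly. The sketch below assumes the three-case reading of $M_{ij}$ (zero below the diagonal, $e^{t_1 - t_j}$ in the first row, and $e^{t_i - t_j}\sqrt{1 - e^{2(t_{i-1} - t_i)}}$ elsewhere) and checks that $M'M$ reproduces $e^{-|t_i - t_j|}$. The function names and the sample time points are illustrative.

```python
import math


def exponential_factor(times):
    """Upper-triangular M for the covariance exp(-|t_i - t_j|), times increasing."""
    n = len(times)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == 0:
            scale = 1.0
        else:
            scale = math.sqrt(1.0 - math.exp(2.0 * (times[i - 1] - times[i])))
        for j in range(i, n):
            m[i][j] = scale * math.exp(times[i] - times[j])
    return m


def gram(m):
    """Return M'M for a square matrix given as nested lists."""
    n = len(m)
    return [[sum(m[i][j] * m[i][k] for i in range(n)) for k in range(n)]
            for j in range(n)]


times = [0.0, 0.5, 1.7, 2.0]
product = gram(exponential_factor(times))
```

The sum over rows telescopes, $e^{-t_j - t_k}\,[e^{2t_1} + \sum_i (e^{2t_i} - e^{2t_{i-1}})] = e^{t_j - t_k}$ for $j \le k$, which is why no matrix inversion is needed in this case.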
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ermination of Redundancies in a 
of Patterns* 


the above article,! Glovazky states 
the minimum number of cells that 
, be scanned to uniquely identify his 
ibhitrary patterns is four, namely, 1, 
8. ‘ 
would like to submit that the patterns 
be uniquely identified by scanning 
r cells 3, 4, and 9 or 3, 5, and 9—in 
r case one cell less than his “minimum 
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hor’s Comment 
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nted in the article does not represent 
absolute minimum number of cells. 
ras meant to be conveyed by the last 
of the paper, the number of cells 
h appears in the reduced schedule 
ads on the particular “scanning path,”’ 
on the particular sequence of columns 
mn. The significance of the proposed 
iod is that, regardless of the sequence, 
can always be sure that the resulting 
Jule will not contain more than P-1 
This by itself constitutes a great 
ovement in many practical cases 
e the number of cells greatly exceeds 
vumber of patterns (C > P). 
ie problem of finding the absolute 
mum schedule (or schedules) is by 
- of magnitudes more difficult, because 
. is no @ priori knowledge of what 
mn sequence will yield the most 
ent reduction. The optimal sequence 
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mum schedule involves, to the author’s 
ledge, examination of all possible 
an sequences—a task which in most 
is too laborious to be of any practical 
In such cases the simplicity and 
yactness of the separation scheme 
ly outweighs the limitations mentioned 
e. 
ARTHUR GLOVAZKY 
Raytheon Mfg. Co., 
Waltham, Mass. 
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Fig. 1. 


Letter from Dr. McCluskey³


In connection with the article by
Glovazky,¹ I would like to point out a
method for determining the redundancies
in a set of patterns without a scanning
path being assumed. This method should
lead to a specification of the fewest cells
which must be scanned or to all alternative
sets of cells which can be scanned to
identify the patterns.

I will assume that the set of patterns is
specified by a code matrix, Fig. 2, (this is
the transpose of Glovazky's code schedule).
Each row of this matrix represents a cell
and each column a pattern. If the ith cell
of the jth pattern is black, the entry in the
ith row and jth column of the matrix
($a_{ij}$) is 1, otherwise it is 0. From this
matrix another matrix called the pair
matrix, Fig. 3, is to be formed in which each
column is derived by taking the sum
modulo two of (see Fig. 4) the elements
from a pair of columns of the code matrix.
This is to be done for all columns of the
code matrix so that if the code matrix has
r rows and m columns, the pair matrix
will have r rows and $_mC_2$ columns. The
column of the pair matrix derived from
columns j and k of the code matrix will
have a 1 in only those rows which correspond
to cells in which the jth and kth
patterns differ. In order to distinguish
between the jth and kth patterns, at least
one cell for which the corresponding row
has a 1 in the jk column of the pair matrix
must be scanned. In order to distinguish
among all pairs of patterns, a set of cells
such that the corresponding rows have at
least one 1 in each column of the pair
matrix must be scanned. Thus the sets of
cells to be scanned can be determined by
discovering each set of rows which has the
property that every column has a 1 in at
least one row of the set.
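The construction described above can be sketched in a few lines. The code matrix used here is a small hypothetical example (four cells, four patterns), not Glovazky's, and the exhaustive search over row sets is only a stand-in for the selection techniques cited in the letter; the function names are illustrative.

```python
from itertools import combinations


def pair_matrix(code):
    """Rows = cells, columns = pattern pairs; entry = sum modulo two."""
    pairs = list(combinations(range(len(code[0])), 2))
    return pairs, [[row[j] ^ row[k] for j, k in pairs] for row in code]


def minimal_cell_sets(code):
    """All smallest sets of rows having a 1 in every column of the pair matrix."""
    pairs, pm = pair_matrix(code)
    for size in range(1, len(pm) + 1):
        found = [
            rows for rows in combinations(range(len(pm)), size)
            if all(any(pm[r][c] for r in rows) for c in range(len(pairs)))
        ]
        if found:
            return found
    return []  # reached only if two patterns are identical


# Hypothetical code matrix: each row is a cell, each column a pattern.
example_code = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
cell_sets = minimal_cell_sets(example_code)
```

In this toy case the last cell is black in every pattern, so its pair-matrix row is all zeros and it never appears in a minimal set. Exhaustive search grows exponentially with the number of cells, so it is practical only for small matrices.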


³ Received by the PGIT, March 26, 1957.


There are several techniques known for
picking such a set of rows. This is exactly
the problem which arises in minimizing
Boolean functions; the pair matrix corresponds
directly to a prime implicant
table.⁴

For the pair matrix the following preliminary
reduction is possible: if there are
two columns j and k such that column k
has a 1 in all rows in which column j has
a 1, column k can be deleted. In Fig. 3
column 4, 5 has 1's in every row in which
column 5, 6 has 1's, therefore column 4, 5
can be deleted. Satisfying the requirements
for column j will automatically
satisfy those for column k. Application of
this method to the set of patterns given in
Glovazky's paper gives two scanning
sequences containing only three cells:
(3, 4, 9) and (3, 5, 9).
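The preliminary reduction can be sketched as a subset test on the sets of rows in which each pair-matrix column has a 1. The column labels and row sets below are hypothetical (Fig. 3 itself is not reproduced here); they only mimic the relation stated for columns 4, 5 and 5, 6. Keeping the first of two identical columns is an assumption added to avoid deleting both.

```python
def delete_dominating_columns(columns):
    """Drop column k when some other column j has all of its 1's inside k's 1's.

    columns maps a label to the set of rows holding a 1. Of two identical
    columns the earlier one is kept (tie-break assumed here).
    """
    items = list(columns.items())
    kept = {}
    for pos_k, (label_k, rows_k) in enumerate(items):
        deletable = any(
            rows_j < rows_k or (rows_j == rows_k and pos_j < pos_k)
            for pos_j, (_, rows_j) in enumerate(items)
            if pos_j != pos_k
        )
        if not deletable:
            kept[label_k] = rows_k
    return kept


# Hypothetical row sets: column "4,5" contains every row of column "5,6".
example_columns = {
    "4,5": {1, 3, 6},
    "5,6": {1, 3},
    "1,2": {2},
    "2,3": {2},
}
reduced = delete_dominating_columns(example_columns)
```

Any row set that covers column "5,6" necessarily covers column "4,5", so the reduced table has the same covering sets as the original one.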

While this technique is longer than that
using the code schedule and reduced code
schedule,¹ it considers all possible scanning
paths simultaneously and therefore is
roughly equivalent to n! applications of
the code schedule technique (where n
equals the number of cells). It should be
useful where all scanning paths are to be
considered.

EDWARD J. McCLUSKEY, JR.
Bell Telephone Labs.
Whippany, N.J.


⁴ This was first introduced by W. V. Quine, "The
problem of simplifying truth functions," Amer. Math.
Monthly, vol. 59, pp. 521-531; October, 1952.
Techniques for selecting the sets of rows are given
by E. J. McCluskey, Jr., "Minimization of Boolean
functions," Bell Sys. Tech. J., vol. 35, pp. 1417-1444;
November, 1956.


Author's Comment⁵


McCluskey’s pair matrix presents a 
new formulation of the reduction problem, 


⁵ Received by the PGIT, April 20, 1956.
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Fig. 2—Code matrix for code schedule given by Glovazky.! 
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Fig. 3—Pair matrix derived from code matrix of Fig. 1. 
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Fig. 4—Sum modulo two. 


but by no means simplifies it. In the original
matrix the objective was to delete as many
rows as possible without destroying the
uniqueness of the columns; in the pair
matrix the objective is to delete as many
rows as possible without including any
column which comprises entirely of zeros.
Either of these objectives calls for a procedure
which is too long and too involved
to be useful.

The pair matrix contains P(P − 1)/2
columns as compared with P columns in
the original matrix. The preliminary
deletions alone require in the order of
P⁴/8 operations since they involve the
comparison of all possible column pairs
in the pair matrix. The subsequent reduction
process requires even a greater
number of operations, with the result that
the total amount of labor involved is
roughly equal, as McCluskey notes,
to that of repeatedly applying my separation
procedure to all possible column
sequences.

Thus the pair matrix approach constitutes
one of the general methods to which
I referred in my rebuttal to Riekems
comments. In cases where the deletion of
C − P + 1 cells is satisfactory, the separation
method is definitely preferable.


ARTHUR GLOVAZKY
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